Abstract

We re-examine the notion of relative $(p,\varepsilon)$-approximations, recently introduced in
[CKMS06], and establish upper bounds on their size, in general range spaces of finite
VC-dimension, using the sampling theory developed in [LLS01] and in several earlier
studies [Pol86, Hau92, Tal94]. We also survey the different notions of sampling used
in computational geometry, learning, and other areas, and show how they relate to
each other. We then give constructions of smaller-size relative $(p,\varepsilon)$-approximations
for range spaces that involve points and halfspaces in two and higher dimensions.
The planar construction is based on a new structure, spanning trees with small relative
crossing number, which we believe to be of independent interest. Relative
$(p,\varepsilon)$-approximations arise in several geometric problems, such as approximate range
counting, and we apply our new structures to obtain efficient solutions for approximate range
counting in three dimensions. We also present a simple solution for the planar case.



1 Introduction 

The main problem that has motivated the study in this paper is approximate range counting. 
In a typical example, one is given a set P of points in the plane, and the goal is to preprocess 
P into a data structure which supports efficient approximate counting of the number of 
points of P that lie inside a query halfplane. The hope is that approximate counting can be 
done more efficiently than exact counting. 

This is an instance of a more general and abstract setting. In general, we are given a
range space $(X,\mathcal{R})$, where $X$ is a set of $n$ objects and $\mathcal{R}$ is a collection of subsets of $X$,
called ranges. In a typical geometric setting, $X$ is a subset of some infinite ground set $U$
(e.g., $U = \mathbb{R}^d$ and $X$ is a finite point set in $\mathbb{R}^d$), and $\mathcal{R} = \{ r \cap X \mid r \in \mathcal{R}_U \}$, where $\mathcal{R}_U$ is a
collection of subsets (ranges) of $U$ of some simple shape, such as halfspaces, simplices, balls,
etc. (To simplify the notation, we will use $\mathcal{R}$ and $\mathcal{R}_U$ interchangeably.) The measure of a
range $r \in \mathcal{R}$ in $X$ is the quantity

$$ \overline{X}(r) = \frac{|X \cap r|}{|X|}. $$

Given $(X,\mathcal{R})$ as above, and a parameter $0 < \varepsilon < 1$, the goal is to preprocess $X$ into a data
structure that supports efficient queries of the form: Given $r \in \mathcal{R}_U$, compute a number $t$
such that

$$ (1-\varepsilon)\overline{X}(r) \le t \le (1+\varepsilon)\overline{X}(r). \qquad (1) $$

We refer to the estimate $t|X|$ as an $\varepsilon$-approximate count of $X \cap r$.
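
As a concrete illustration of the measure $\overline{X}(r)$ and of condition (1), the following minimal Python sketch treats ranges as halfplanes $ax + by \le c$ over a small planar point set. The names `measure` and `is_eps_approximate` and the halfplane encoding are assumptions made for this illustration; they do not come from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Halfplane = Tuple[float, float, float]  # (a, b, c) encodes a*x + b*y <= c (hypothetical encoding)


def measure(points: List[Point], h: Halfplane) -> float:
    """Fraction of the points lying in the halfplane, i.e. |X ∩ r| / |X|."""
    a, b, c = h
    inside = sum(1 for (x, y) in points if a * x + b * y <= c)
    return inside / len(points)


def is_eps_approximate(t: float, true_measure: float, eps: float) -> bool:
    """Condition (1): (1 - eps) * measure <= t <= (1 + eps) * measure."""
    return (1 - eps) * true_measure <= t <= (1 + eps) * true_measure


pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
r = (1.0, 0.0, 1.5)   # the halfplane x <= 1.5
m = measure(pts, r)   # two of the four points lie in r
```

Note that when the true measure is 0, condition (1) only accepts $t = 0$, which is the emptiness-detection requirement discussed in the text.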

The motivation for seeking approximate range counting techniques is that exact range
counting (i.e., computing the exact value $\overline{X}(r)$, for $r \in \mathcal{R}$) is (more) expensive. For instance,
consider the classical halfspace range counting problem [Mat92], which is the main application
considered in this paper. Here, for a set $P$ of $n$ points in $\mathbb{R}^d$, for $d \ge 2$, the best known
algorithm for exact halfspace range counting with near-linear storage takes $O(n^{1-1/d})$ time
[Mat92]. As shown in several recent papers, if we only want to approximate the count, as in
Eq. (1), there exist faster solutions, in which the query time is close to $O(n^{1-1/\lfloor d/2 \rfloor})$ (and is
polylogarithmic in two and three dimensions) [AH08, AS08, KS06, KRS08a, KRS08b].

Notice that the problem of approximate range counting becomes more challenging as the
size of $X \cap r$ decreases. At the extreme, when $|X \cap r| < 1/\varepsilon$, we must produce the count
exactly. In particular, we need to be able to detect (without any error) whether a given
query range $r$ is empty, i.e., satisfies $X \cap r = \emptyset$. Thus, approximate range counting (in the
sense defined above) is at least as hard as range emptiness detection.

We make the standard assumption that the range space $(X,\mathcal{R})$ (or, in fact, $(U,\mathcal{R}_U)$)
has finite VC-dimension $\delta$, which is a constant independent of $n$. This is indeed the case
in many geometric applications. In general, range spaces involving semi-algebraic ranges of
constant description complexity, i.e., semi-algebraic sets defined as a Boolean combination of
a constant number of polynomial equations and inequalities of constant maximum degree,
have finite VC-dimension. Halfspaces, balls, ellipsoids, simplices, and boxes are examples of
ranges of this kind; see [Cha01, HW87, Mat99, PA95] for definitions and more details.



Known notions of approximations. A standard and general technique for tackling the
approximate range counting problem is to use $\varepsilon$-approximations. An (absolute-error)
$\varepsilon$-approximation for $(X,\mathcal{R})$ is a subset $Z \subseteq X$ such that, for each $r \in \mathcal{R}$, $|\overline{Z}(r) - \overline{X}(r)| \le \varepsilon$.
In the general case, it is known^1 that a random sample of size $O(\delta/\varepsilon^2)$ is an $\varepsilon$-approximation
with at least some positive constant probability [LLS01, Tal94], and improved bounds are
known in certain special cases; see below for details.

^1 Somewhat selectively; see below.

Another notion of approximation was introduced by Brönnimann et al. [Bro95, BCM99].
A subset $Z \subseteq X$ is a sensitive $\varepsilon$-approximation if for all ranges $r \in \mathcal{R}$, we have
$|\overline{X}(r) - \overline{Z}(r)| \le (\varepsilon/2)\left(\overline{X}(r)^{1/2} + \varepsilon\right)$. Brönnimann et al. present a deterministic algorithm for
computing sensitive approximations of size $O\left(\frac{\delta}{\varepsilon^2}\log\frac{\delta}{\varepsilon}\right)$, in deterministic time
$O(\delta)^{3\delta}(1/\varepsilon)^{2\delta}\log^{\delta}(\delta/\varepsilon)\,|X|$.

Another interesting notion of sampling, studied by Li et al. [LLS01], is a $(\nu,\alpha)$-sample.
For given parameters $\alpha, \nu > 0$, a sample $Z \subseteq X$ is a $(\nu,\alpha)$-sample if, for any range $r \in \mathcal{R}$, we
have $d_\nu(\overline{Z}(r), \overline{X}(r)) < \alpha$, where $d_\nu(x,y) = |x-y|/(x+y+\nu)$. Li et al. gave the currently
best known upper bound on the size of a sample which guarantees this property, showing that
a random sample of $X$ of size $O\left(\frac{1}{\alpha^2\nu}\left(\delta\log\frac{1}{\nu} + \log\frac{1}{q}\right)\right)$ is a $(\nu,\alpha)$-sample with probability
at least $1-q$; see below for more details.

Relative $(p,\varepsilon)$-approximation. In this paper, we consider a variant of these classical
structures, originally proposed a few years ago by Cohen et al. [CKMS06], which provides
relative-error approximations. Ideally, we want a subset $Z \subseteq X$ such that, for each $r \in \mathcal{R}$,
we have

$$ (1-\varepsilon)\overline{X}(r) \le \overline{Z}(r) \le (1+\varepsilon)\overline{X}(r). \qquad (2) $$

This "definition" suffers however from the same syndrome as approximate range counting;
that is, as $|X \cap r|$ shrinks, the absolute precision of the approximation has to increase. At
the extreme, when $Z \cap r = \emptyset$, $X \cap r$ must also be empty; in general, we cannot guarantee
this property, unless we take $Z = X$, which defeats the purpose of using small-size
$\varepsilon$-approximations to speed up approximate counting.

For this reason, we refine the definition, introducing another fixed parameter $0 < p < 1$.
We say that a subset $Z \subseteq X$ is a relative $(p,\varepsilon)$-approximation if it satisfies Eq. (2) for each
$r \in \mathcal{R}$ with $\overline{X}(r) \ge p$. For smaller ranges $r$, the error term $\varepsilon\overline{X}(r)$ in Eq. (2) is replaced
by $\varepsilon p$. As we will shortly observe, relative $(p,\varepsilon)$-approximations are equivalent to
$(\nu,\alpha)$-samplings, with appropriate relations between $p$, $\varepsilon$, and $\nu$, $\alpha$ (see Theorem 2.9), but this
new notion provides a better working definition for approximate range counting and for other
applications.

Known results. As shown by Vapnik and Chervonenkis [VC71] (see also [Cha01, Mat99,
PA95]), there always exist absolute-error $\varepsilon$-approximations of size $\frac{c\delta}{\varepsilon^2}\log\frac{\delta}{\varepsilon}$, where $c$ is an
absolute constant. Moreover, a random sample of this size from $X$ is an $\varepsilon$-approximation
with constant positive probability. This bound has been strengthened by Li et al. [LLS01]
(see also [Tal94]), who have shown that a random sample of size $\frac{c}{\varepsilon^2}\left(\delta + \log\frac{1}{q}\right)$ is an
$\varepsilon$-approximation with probability at least $1-q$, for a sufficiently large (absolute) constant
$c$. (Interestingly, until very recently, this result, worked out in the context of machine
learning applications, does not seem to have been known within the computational geometry
literature.) $\varepsilon$-approximations of size $O\left(\frac{\delta}{\varepsilon^2}\log\frac{\delta}{\varepsilon}\right)$ can also be constructed in deterministic time
$O\left(\delta^{3\delta}\left(\frac{1}{\varepsilon^2}\log\frac{\delta}{\varepsilon}\right)^{\delta}|X|\right)$ [Cha04].

As shown in [Cha01, Cha04, MWW93], there always exist smaller (absolute-error)
$\varepsilon$-approximations, of size

$$ O\left(\frac{1}{\varepsilon^{2-2/(\delta'+1)}}\log^{b-1/(\delta'+1)}\frac{1}{\varepsilon}\right), $$

where $\delta'$ is the exponent of either the primal shatter function of the range space $(X,\mathcal{R})$ (and
then $b = 2$) or the dual shatter function (and then $b = 1$). The time to construct these
improved $\varepsilon$-approximations is roughly the same as the deterministic time bound of [Cha04]
stated above, for the case of the dual shatter function. For the case of the primal shatter
function, the proof is only existential.

Consider next relative $(p,\varepsilon)$-approximations. One of the contributions of this paper is to
show that these approximations are in fact just an equivalent variant of the $(\nu,\alpha)$-samplings
of Li et al. [LLS01]; see Section 2. As a consequence, the analysis of [LLS01] implies that
there exist relative $(p,\varepsilon)$-approximations of size $\frac{c\delta}{\varepsilon^2 p}\log\frac{1}{p}$, where $c$ is an absolute constant. In
fact, any random sample of this many elements of $X$ is a relative $(p,\varepsilon)$-approximation with
constant probability. Success with probability at least $1-q$ is guaranteed if one samples
$\frac{c}{\varepsilon^2 p}\left(\delta\log\frac{1}{p} + \log\frac{1}{q}\right)$ elements of $X$, for a sufficiently large constant $c$ [LLS01].

To appreciate the above bound on the size of relative $(p,\varepsilon)$-approximations, it is
instructive to observe that, for a given parameter $p$, any absolute-error $(\varepsilon p)$-approximation $Z$ will
approximate "large" ranges (of measure at least $p$) to within relative error $\varepsilon$, as in Eq. (2),
as is easily checked (and the inequality for smaller ranges is also trivially satisfied), so $Z$ is
a relative $(p,\varepsilon)$-approximation. However, the Vapnik-Chervonenkis bound on the size of $Z$,
namely, $\frac{c\delta}{\varepsilon^2 p^2}\log\frac{\delta}{\varepsilon p}$, as well as the improved bound of [LLS01, Tal94], are larger by roughly a
factor of $1/p$ than the improved bound stated above.

The existence of a relative $(p,\varepsilon)$-approximation $Z$ provides a simple mechanism for
approximate range counting: Given a range $r \in \mathcal{R}$, count $Z \cap r$ exactly, say, by brute force in
$O(|Z|)$ time, and output $|Z \cap r| \cdot |X| / |Z|$ as a (relative) $\varepsilon$-approximate count of $X \cap r$.
However, this will work only for ranges of size at least $pn$. Aronov and Sharir [AS08] show that
an appropriate incorporation of relative $(p,\varepsilon)$-approximations into standard range searching
data structures yields a procedure for approximate range counting that works, quite
efficiently, for ranges of any size. This has recently been extended, by Sharir and Shaul [SS09],
to approximate range counting with general semi-algebraic ranges.
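
The brute-force counting mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample `Z` is drawn uniformly at random with an arbitrary size (500) rather than a size derived from the bounds in the text, and the function name `approximate_count` and the test range `left_half` are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def approximate_count(sample: Sequence[Point], n: int,
                      in_range: Callable[[Point], bool]) -> float:
    """Estimate |X ∩ r| as |Z ∩ r| * |X| / |Z|, scanning Z by brute force."""
    hits = sum(1 for z in sample if in_range(z))
    return hits * n / len(sample)


rng = random.Random(0)  # fixed seed, only to make the demo repeatable
X: List[Point] = [(rng.random(), rng.random()) for _ in range(10_000)]
Z = rng.sample(X, 500)  # sample size chosen arbitrarily for the demo


def left_half(pt: Point) -> bool:
    """A sample query range: the halfplane x <= 0.5."""
    return pt[0] <= 0.5


exact = sum(1 for x in X if left_half(x))          # O(|X|) exact count
estimate = approximate_count(Z, len(X), left_half)  # O(|Z|) estimate
```

The query cost depends only on $|Z|$, which is the point of the construction; the guarantee on the estimate holds only for ranges of measure at least $p$, as noted in the text.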

Our results. In this paper, we present several constructions and bounds involving relative
$(p,\varepsilon)$-approximations.

We first consider samplings in general range spaces of finite VC-dimension, and establish
relations between several different notions of samplings, including $(\nu,\alpha)$-samplings, relative
$(p,\varepsilon)$-approximations, and sensitive $\varepsilon$-approximations. Our main observations are:

(i) The notion of relative $(p,\varepsilon)$-approximation is equivalent to that of $(\nu,\alpha)$-sample, by choosing $\nu$
to be proportional to $p$ and $\alpha$ proportional to $\varepsilon$; see Theorem 2.9.

(ii) A sensitive $(\varepsilon\sqrt{p})$-approximation is also a relative $(p,\varepsilon)$-approximation.

(iii) The result of Li et al. [LLS01] is sufficiently powerful, so as to imply known bounds on
the size of many of the different notions of samplings, including $\varepsilon$-nets, $\varepsilon$-approximations,
sensitive $\varepsilon$-approximations, and, as just said, relative $(p,\varepsilon)$-approximations. Some of these
connections have already been noted earlier, in [LLS01] and in [Har08]. We offer this portion
of Section 2 as a service to the computational geometry community, which, as already noted,
is not as aware of the results of [LLS01] and of their implications as the machine learning
community.

Next, we return to geometric range spaces, and study two cases where one can construct
relative $(p,\varepsilon)$-approximations of smaller size. The first case involves planar point sets and
halfplane ranges, and the second case involves point sets in $\mathbb{R}^d$, $d \ge 3$, and halfspace ranges.
In the planar case, we show the existence, and provide efficient algorithms for the
construction, of relative $(p,\varepsilon)$-approximations of size $O\left(\frac{1}{\varepsilon^{4/3}p}\log^{4/3}\frac{1}{\varepsilon p}\right)$. Our technique also shows in
this case the existence of sensitive $\varepsilon$-approximations with improved quality of
approximation. Specifically, for a planar point set $X$, we show that there exists a subset $Z \subseteq X$ of size
$O\left(\frac{1}{\varepsilon^2}\log^{4/3}\frac{1}{\varepsilon}\right)$, such that, for any halfplane $r$, we have
$|\overline{X}(r) - \overline{Z}(r)| \le \frac{1}{2}\varepsilon^{3/2}\,\overline{X}(r)^{1/4} + \varepsilon^2$.
(This new error term is indeed an improvement when $\overline{X}(r) > \varepsilon^2$ and is the same as the
standard term otherwise.)

In the planar case, the construction is based on an interesting generalization of spanning
trees with small crossing number, a result that we believe to be of independent interest.
Specifically, we show that any finite point set $P$ in the plane has a spanning tree with the
following property: For any $k \le |P|/2$, any $k$-shallow line (a line that has at most $k$ points of
$P$ in one of the halfplanes that it bounds) crosses at most $O(\sqrt{k}\log(n/k))$ edges of the tree.
In contrast, the classical construction of Welzl [Wel92] (see also [CW89]) only guarantees
this property for $k = n$; i.e., it yields the uniform bound $O(\sqrt{n})$ on the crossing number.
We refer to such a tree as a spanning tree with low relative crossing number, and show how
to use it in the construction of small-size relative $(p,\varepsilon)$-approximations.

Things are more complicated in three (and higher) dimensions. We were unable to extend
the planar construction of spanning trees with low relative crossing number to $\mathbb{R}^3$ (nor to
higher dimensions), and this remains an interesting open problem. (We give a
counterexample that indicates why the planar construction cannot be extended "as is" to 3-space.)
Instead, we base our construction on the shallow partition theorem of Matoušek [Mat91b],
and construct a set $Z$ of size $O\left(\frac{1}{\varepsilon^{3/2}p}\log^{3/2}\frac{1}{\varepsilon p}\right)$, which yields an absolute approximation error
of at most $\varepsilon p$ for halfspaces that contain at most $pn$ points. Note that this is the "wrong"
inequality: to guarantee small relative error we need this to hold for all ranges with at least
$pn$ points. To overcome this difficulty, we construct a sequence of approximation sets, each
capable of producing a relative $\varepsilon$-approximate count for ranges that have roughly a fixed
size, where these size ranges grow geometrically, starting at $pn$ and ending at roughly $n$.
The sizes of these sets decrease geometrically, so that the size of the first set (that caters to
ranges with about $pn$ points), which is $O\left(\frac{1}{\varepsilon^{3/2}p}\log^{3/2}\frac{1}{\varepsilon p}\right)$, dominates asymptotically the overall
size of all of them. We output this sequence of sets, and show how to use them to obtain an
$\varepsilon$-approximate count of any range with at least $pn$ points.

The situation is even more complicated in higher dimensions. The basic
approach used in the three-dimensional case can be extended to higher dimensions, using the
appropriate version of the shallow partition theorem. However, the bounds get somewhat
more complicated, and apply only under certain restrictions on the relationship between $\varepsilon$
and $p$. We refer the reader to Section 4.2, where these bounds and restrictions are spelled
out in detail.

The paper is organized as follows: In Section 2 we survey the sampling notions mentioned
above, and show how they relate to each other. In Section 3 we describe how to build a
small relative $(p,\varepsilon)$-approximation in the planar case, by first showing how to construct a
spanning tree with low relative crossing number. In Section 4 we extend the result to higher
dimensions. In Section 5 we revisit the problems of halfplane and 3-dimensional halfspace
approximate range counting, and provide algorithms whose query time is faster than that
of the previous algorithms.^2 This section is somewhat independent of the rest of the paper,
although we do use relative $(p,\varepsilon)$-approximations for the 3-dimensional case. We conclude
in Section 6 with a brief discussion of the results and with some open problems.



2 On the relation between some sampling notions 

In this section we study relationships between several commonly used notions of samplings
in abstract range spaces. In particular, we show that many of these notions are variants or
special cases of one another. Combined with the powerful result of Li et al. [LLS01], this
allows us to establish, or re-establish, for each of these families of samplings, upper bounds
on the size of samples needed to guarantee that they belong to the family (with constant or
with high probability).



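Definitions. We begin by listing the various kinds of samplings under consideration. In
what follows, we assume that $(X,\mathcal{R})$ is an arbitrary range space of finite VC-dimension $\delta$.

Definition 2.1 For a given parameter $0 < \varepsilon < 1$, a subset $Z \subseteq X$ is an $\varepsilon$-net for $(X,\mathcal{R})$ if
$r \cap Z \ne \emptyset$, for every $r \in \mathcal{R}$ such that $\overline{X}(r) \ge \varepsilon$.

Definition 2.2 For a given parameter $0 < \varepsilon < 1$, a subset $Z \subseteq X$ is an $\varepsilon$-approximation
for $(X,\mathcal{R})$ if, for each $r \in \mathcal{R}$, we have $|\overline{X}(r) - \overline{Z}(r)| \le \varepsilon$.

Definition 2.3 For given parameters $0 < p, \varepsilon < 1$, a subset $Z \subseteq X$ is a relative
$(p,\varepsilon)$-approximation for $(X,\mathcal{R})$ if, for each $r \in \mathcal{R}$, we have

(i) $(1-\varepsilon)\overline{X}(r) \le \overline{Z}(r) \le (1+\varepsilon)\overline{X}(r)$, if $\overline{X}(r) \ge p$.

(ii) $\overline{X}(r) - \varepsilon p \le \overline{Z}(r) \le \overline{X}(r) + \varepsilon p$, if $\overline{X}(r) \le p$.

Definition 2.4 For a given parameter $0 < \varepsilon < 1$, a subset $Z \subseteq X$ is a sensitive
$\varepsilon$-approximation for $(X,\mathcal{R})$ if, for each $r \in \mathcal{R}$, we have $|\overline{Z}(r) - \overline{X}(r)| \le \frac{\varepsilon}{2}\left(\overline{X}(r)^{1/2} + \varepsilon\right)$.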


Finally, for a parameter $\nu > 0$, consider the distance function between nonnegative real
numbers $r$ and $s$, given by

$$ d_\nu(r,s) = \frac{|r-s|}{r+s+\nu}. $$

$d_\nu(\cdot,\cdot)$ satisfies the triangle inequality [LLS01], and is thus a metric (the proof is
straightforward albeit somewhat tedious).

Definition 2.5 For given parameters $0 < \alpha < 1$ and $\nu > 0$, a subset $Z \subseteq X$ is a
$(\nu,\alpha)$-sample for $(X,\mathcal{R})$ if, for each range $r \in \mathcal{R}$, we have $d_\nu(\overline{Z}(r), \overline{X}(r)) < \alpha$.

(Note that $\alpha \ge 1$ is uninteresting, because $d_\nu$ is always at most 1.)
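
The five sampling notions above can be compared directly on a small finite range space. The sketch below is an illustration only: the helper names are hypothetical, ranges are represented as Python sets, the inequalities in Definitions 2.1 to 2.4 are read as non-strict, and the two cases of Definition 2.3 are folded into the single tolerance $\varepsilon\max\{\overline{X}(r), p\}$.

```python
from typing import FrozenSet, List, Sequence


def meas(S: Sequence[int], r: FrozenSet[int]) -> float:
    """Measure of the range r in the finite set S: |S ∩ r| / |S|."""
    return sum(1 for x in S if x in r) / len(S)


def d_nu(a: float, b: float, nu: float) -> float:
    """The distance |a - b| / (a + b + nu) used for (nu, alpha)-samples."""
    return abs(a - b) / (a + b + nu)


def is_eps_net(X, Z, ranges, eps) -> bool:
    return all(meas(Z, r) > 0 for r in ranges if meas(X, r) >= eps)


def is_eps_approximation(X, Z, ranges, eps) -> bool:
    return all(abs(meas(X, r) - meas(Z, r)) <= eps for r in ranges)


def is_relative_approximation(X, Z, ranges, p, eps) -> bool:
    # Error at most eps * X(r) for heavy ranges and eps * p for light ones.
    return all(abs(meas(X, r) - meas(Z, r)) <= eps * max(meas(X, r), p)
               for r in ranges)


def is_sensitive_approximation(X, Z, ranges, eps) -> bool:
    return all(abs(meas(X, r) - meas(Z, r)) <= (eps / 2) * (meas(X, r) ** 0.5 + eps)
               for r in ranges)


def is_nu_alpha_sample(X, Z, ranges, nu, alpha) -> bool:
    return all(d_nu(meas(Z, r), meas(X, r), nu) < alpha for r in ranges)


# Toy range space: X = {0,...,7}, ranges are the prefixes {0,...,j-1}.
X: List[int] = list(range(8))
ranges = [frozenset(range(j)) for j in range(1, 9)]
Z = [1, 3, 5, 7]  # every second element of X
```

In this toy example the sample `Z` misses the singleton range `{0}`, so it is an $\varepsilon$-net only for $\varepsilon$ above $1/8$, while its absolute error never exceeds $1/8$.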



^2 These results have recently been improved by Afshani and Chan [AC09], at least in three dimensions,
after the original preparation of the present paper.

Equivalence of relative $(p,\varepsilon)$-approximations and $(\nu,\alpha)$-samples. We begin the
analysis with the following easy properties. The first property is a direct consequence of the
definition of $d_\nu$.

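Observation 2.6 Let $\alpha, \nu, m, s$ be non-negative real numbers, with $\alpha < 1$. Then $d_\nu(m,s) < \alpha$
if and only if

$$ s \in \left( \left(1 - \frac{2\alpha}{1+\alpha}\right)m - \frac{\alpha\nu}{1+\alpha},\;\; \left(1 + \frac{2\alpha}{1-\alpha}\right)m + \frac{\alpha\nu}{1-\alpha} \right). $$

Corollary 2.7 For any non-negative real numbers $\nu, \alpha, m, s$, with $\alpha < 1$, put

$$ \Delta := \frac{2\alpha}{1+\alpha}\,m + \frac{\alpha\nu}{1+\alpha} \quad\text{and}\quad \Delta' := \frac{2\alpha}{1-\alpha}\,m + \frac{\alpha\nu}{1-\alpha} = \frac{1+\alpha}{1-\alpha}\,\Delta. $$

Then we have:

(i) If $|s - m| < \Delta$ then $d_\nu(m,s) < \alpha$.
(ii) If $d_\nu(m,s) < \alpha$ then $|s - m| < \Delta'$.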
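
Observation 2.6 is easy to sanity-check numerically. The sketch below (function names are illustrative, not from the paper) computes the interval obtained by solving $|m - s| < \alpha(m + s + \nu)$ for $s$, and compares membership in that interval against the direct test $d_\nu(m,s) < \alpha$ on a grid of values in $[0,1]$. Grid points that fall numerically on the interval boundary are skipped, since floating-point rounding could decide them either way.

```python
from typing import Tuple


def d_nu(r: float, s: float, nu: float) -> float:
    """The distance |r - s| / (r + s + nu)."""
    return abs(r - s) / (r + s + nu)


def sample_interval(m: float, nu: float, alpha: float) -> Tuple[float, float]:
    """Open interval of all s with d_nu(m, s) < alpha (assumes 0 <= alpha < 1)."""
    lo = (1 - 2 * alpha / (1 + alpha)) * m - alpha * nu / (1 + alpha)
    hi = (1 + 2 * alpha / (1 - alpha)) * m + alpha * nu / (1 - alpha)
    return lo, hi


def observation_holds_on_grid(nu: float, alpha: float, steps: int = 100) -> bool:
    """Compare the two characterizations for all m, s in {0, 1/steps, ..., 1}."""
    for a in range(steps + 1):
        m = a / steps
        lo, hi = sample_interval(m, nu, alpha)
        for b in range(steps + 1):
            s = b / steps
            if min(abs(s - lo), abs(s - hi)) < 1e-9:
                continue  # numerically on the boundary: skip
            if (d_nu(m, s, nu) < alpha) != (lo < s < hi):
                return False
    return True
```

The half-widths $\Delta$ and $\Delta'$ of Corollary 2.7 are the distances from $m$ to the lower and upper endpoints returned by `sample_interval`, respectively.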

Lemma 2.8 If $Y \subseteq X$ is a $(\nu,\alpha)$-sample for $(X,\mathcal{R})$, and $Z \subseteq Y$ is a $(\nu,\alpha)$-sample for
$(Y,\mathcal{R})$, then $Z$ is a $(\nu,2\alpha)$-sample for $(X,\mathcal{R})$.

Proof: An immediate consequence of the triangle inequality for $d_\nu$. ■

The following theorem is one of the main observations in this section.

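Theorem 2.9 Let $(X,\mathcal{R})$ be a range space as above.

(i) If $Z \subseteq X$ is a $(\nu,\alpha)$-sample for $(X,\mathcal{R})$, with $0 < \alpha < 1/4$ and $\nu > 0$, then $Z$ is a
relative $(\nu,4\alpha)$-approximation for $(X,\mathcal{R})$.

(ii) If $Z$ is a relative $(\nu,\alpha)$-approximation for $(X,\mathcal{R})$, with $0 < \alpha < 1$ and $\nu > 0$, then $Z$ is
a $(\nu,\alpha)$-sample for $(X,\mathcal{R})$.

Proof: (i) By Corollary 2.7(ii), we have, for each $r \in \mathcal{R}$,

$$ \left|\overline{X}(r) - \overline{Z}(r)\right| < \frac{2\alpha}{1-\alpha}\,\overline{X}(r) + \frac{\alpha\nu}{1-\alpha} < \frac{8}{3}\alpha\overline{X}(r) + \frac{4}{3}\alpha\nu. $$

Thus, if $\overline{X}(r) \ge \nu$ then $(1-4\alpha)\overline{X}(r) \le \overline{Z}(r) \le (1+4\alpha)\overline{X}(r)$, and if $\overline{X}(r) \le \nu$ then
$|\overline{X}(r) - \overline{Z}(r)| \le 4\alpha\nu$, establishing the claim.

(ii) If $\overline{X}(r) \ge \nu$ then

$$ \left|\overline{X}(r) - \overline{Z}(r)\right| \le \alpha\overline{X}(r) < \frac{2\alpha}{1+\alpha}\,\overline{X}(r) + \frac{\alpha\nu}{1+\alpha}, $$

which implies, by Corollary 2.7(i), that $d_\nu(\overline{X}(r), \overline{Z}(r)) < \alpha$.
If $\overline{X}(r) \le \nu$ then

$$ \left|\overline{X}(r) - \overline{Z}(r)\right| \le \alpha\nu \le \alpha\left(\overline{X}(r) + \overline{Z}(r) + \nu\right), $$

and the claim follows. ■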
Corollary 2.10 For a range space $(X,\mathcal{R})$, if $Y \subseteq X$ is a relative $(\nu,\alpha)$-approximation for $(X,\mathcal{R})$,
and $Z \subseteq Y$ is a relative $(\nu,\alpha)$-approximation for $(Y,\mathcal{R})$, with $0 < \alpha < 1/8$ and $\nu > 0$, then $Z$ is a
relative $(\nu,8\alpha)$-approximation for $(X,\mathcal{R})$.

Proof: By Theorem 2.9(ii) and Lemma 2.8, $Z$ is a $(\nu,2\alpha)$-sample for $(X,\mathcal{R})$. By
Theorem 2.9(i), it is then a relative $(\nu,8\alpha)$-approximation for $(X,\mathcal{R})$. ■

We next recall the bound established in [LLS01], and then apply it to Theorem 2.9.
Specifically, we have:



Theorem 2.11 (i) (Li et al. [LLS01]) A random sample of $X$ of size

$$ \frac{c}{\alpha^2\nu}\left(\delta\log\frac{1}{\nu} + \log\frac{1}{q}\right), $$

for an appropriate absolute constant $c$, is a $(\nu,\alpha)$-sample for $(X,\mathcal{R})$ with probability at least
$1-q$.

(ii) Consequently, a random sample of $X$ of size

$$ \frac{c'}{\varepsilon^2 p}\left(\delta\log\frac{1}{p} + \log\frac{1}{q}\right), $$

for another absolute constant $c'$, is a relative $(p,\varepsilon)$-approximation for $(X,\mathcal{R})$ with probability
at least $1-q$.
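
To get a feeling for the bound in Theorem 2.11(ii), the following sketch evaluates $\frac{c'}{\varepsilon^2 p}\left(\delta\log\frac{1}{p} + \log\frac{1}{q}\right)$ numerically. The theorem does not specify the constant $c'$ or the base of the logarithm, so the default `c = 1.0` and the use of natural logarithms are placeholders chosen only for illustration; the resulting numbers indicate how the bound scales with the parameters, not actual sample sizes.

```python
import math


def relative_sample_size(eps: float, p: float, delta: float, q: float,
                         c: float = 1.0) -> int:
    """Evaluate (c / (eps^2 p)) * (delta * ln(1/p) + ln(1/q)), rounded up.

    c is a placeholder for the unspecified absolute constant c'.
    """
    if not (0 < eps < 1 and 0 < p < 1 and 0 < q < 1):
        raise ValueError("eps, p and q must lie strictly between 0 and 1")
    bound = c / (eps ** 2 * p) * (delta * math.log(1 / p) + math.log(1 / q))
    return math.ceil(bound)


# Halving p roughly doubles the bound (up to the logarithmic term).
size_p1 = relative_sample_size(eps=0.1, p=0.02, delta=2, q=0.1)
size_p2 = relative_sample_size(eps=0.1, p=0.01, delta=2, q=0.1)
```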

We next observe that $\varepsilon$-nets and $\varepsilon$-approximations are special cases of $(\nu,\alpha)$-samples,
where the second observation has already been made in [LLS01]. The bound on the size of
$(\nu,\alpha)$-samples (Theorem 2.11(i)) then implies the known bounds on the size of $\varepsilon$-nets (see
[HW87]) and of $\varepsilon$-approximations (see [LLS01]). Specifically, we have:

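Theorem 2.12 Let $(X,\mathcal{R})$ be a range space, as above, and let $\varepsilon > 0$.

(i) For any $\alpha \le 1/2$ and $\nu = \varepsilon$, a $(\nu,\alpha)$-sample from $X$ is an $\varepsilon$-net for $(X,\mathcal{R})$.
Consequently, a random sample of $X$ of size $O\left(\frac{1}{\varepsilon}\left(\delta\log\frac{1}{\varepsilon} + \log\frac{1}{q}\right)\right)$, with an appropriate
choice of the constant of proportionality, is an $\varepsilon$-net for $(X,\mathcal{R})$ with probability at least
$1-q$.

(ii) If $\alpha \le \varepsilon/3$ and $\nu \le 1$, then a $(\nu,\alpha)$-sample from $X$ is an $\varepsilon$-approximation for $(X,\mathcal{R})$.
Consequently, a random sample of $X$ of size $O\left(\frac{1}{\varepsilon^2}\left(\delta + \log\frac{1}{q}\right)\right)$, with an appropriate
choice of the constant of proportionality, is an $\varepsilon$-approximation for $(X,\mathcal{R})$ with
probability at least $1-q$.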

Proof: (i) We need to rule out the possibility that, for some range $r \in \mathcal{R}$, we have $\overline{Z}(r) = 0$
and $\overline{X}(r) \ge \varepsilon$. Since $Z$ is an $(\varepsilon,\alpha)$-sample, we must then have

$$ \frac{\overline{X}(r)}{\overline{X}(r) + \varepsilon} < \alpha, $$

which is impossible, since the fraction is at least 1/2.

(ii) With this choice of parameters, we have, for any range $r \in \mathcal{R}$,

$$ \left|\overline{X}(r) - \overline{Z}(r)\right| < \frac{\varepsilon}{3}\left(\overline{X}(r) + \overline{Z}(r) + 1\right) \le \varepsilon, $$

as desired. As noted, the bounds on the sample sizes follow from Theorem 2.11(i). ■

We also note the following (weak) converse of Theorem 2.12(ii): If $Z \subseteq X$ is an
$(\alpha\nu)$-approximation for $(X,\mathcal{R})$ then $Z$ is a $(\nu,\alpha)$-sample for $(X,\mathcal{R})$. Indeed, we have already
noted in the introduction that an $(\alpha\nu)$-approximation for $(X,\mathcal{R})$ is also a relative
$(\nu,\alpha)$-approximation, so the claim follows by Theorem 2.9(ii). This is a weak implication, though,
because, as already noted in the introduction, the bound that it implies on the size of
$(\nu,\alpha)$-samples is weaker than that given in [LLS01] (see Theorem 2.11(i)).



Sensitive approximations. We next show that the existence of sensitive $\varepsilon$-approximations
of size $O\left(\frac{\delta}{\varepsilon^2}\log\frac{1}{\varepsilon}\right)$ can also be established using $(\nu,\alpha)$-samples. The proof is slightly trickier
than the preceding ones, because it uses the fact that a sample of an appropriate size is
(with high probability) a $(\nu_i,\alpha_i)$-sample, for an entire sequence of pairs $(\nu_i,\alpha_i)$. The bound
yielded by the following theorem is in fact (slightly) better than the bound established by
[Bro95, BCM99], as mentioned in the introduction.

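Theorem 2.13 Let $(X,\mathcal{R})$ be a range space, as above, and let $\varepsilon > 0$. A random sample
from $X$ of size

$$ O\left(\frac{1}{\varepsilon^2}\left(\delta\log\frac{1}{\varepsilon} + \log\frac{1}{q}\right)\right) $$

is a sensitive $\varepsilon$-approximation, with probability $\ge 1-q$.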

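Proof: Put $\nu_i = i\varepsilon^2/400$, $\alpha_i = 1/(4i)^{1/2}$, for $i = 1, \ldots, M = \lceil 400/\varepsilon^2 \rceil$. Note that
$\alpha_i^2\nu_i = \varepsilon^2/1600$ for each $i$.

Let $Z$ be a random sample of size

$$ m = O\left(\frac{1}{\varepsilon^2}\left(\delta\log\frac{1}{\varepsilon} + \log\frac{M}{q}\right)\right). $$

Theorem 2.11 implies that, with an appropriate choice of the constant of proportionality, the
following holds: For each $i$, $Z$ is a $(\nu_i,\alpha_i)$-sample, with probability at least $1 - q/M$. Hence,
with probability at least $1-q$, $Z$ is a $(\nu_i,\alpha_i)$-sample for every $i$.

Now consider any range $\mathbf{r} \in \mathcal{R}$, and put $r = \overline{X}(\mathbf{r})$, $s = \overline{Z}(\mathbf{r})$. Let $i$ be the index satisfying
$(i-1)\varepsilon^2/400 \le r \le i\varepsilon^2/400$. Assume first that $i > 1$, so we have $\frac{1}{2}\nu_i \le r \le \nu_i$, and thus

$$ \alpha_i\nu_i = \left(\alpha_i^2\nu_i \cdot \nu_i\right)^{1/2} = \frac{\varepsilon\sqrt{\nu_i}}{40} \le \frac{\varepsilon\sqrt{2r}}{40} < \frac{\varepsilon\sqrt{r}}{20}. \qquad (3) $$

Since $Z$ is a $(\nu_i,\alpha_i)$-sample, we have

$$ \frac{|r-s|}{r+s+\nu_i} < \alpha_i. $$

If $s \le \nu_i$ then this implies

$$ |r-s| < \alpha_i(r+s+\nu_i) \le 3\alpha_i\nu_i < \frac{3\varepsilon\sqrt{r}}{20} < \frac{\varepsilon}{2}\left(\sqrt{r} + \varepsilon\right), $$

so sensitivity holds in this case. Otherwise, if $s > \nu_i \ge r$, then

$$ s - r < \alpha_i(r+s+\nu_i) \;\Longrightarrow\; (1-\alpha_i)(s-r) < \alpha_i(2r+\nu_i) \;\Longrightarrow\; s - r < \frac{\alpha_i(2r+\nu_i)}{1-\alpha_i} \le 2\alpha_i(2r+\nu_i), $$

since $\alpha_i \le 1/2$. Hence, by Eq. (3), we have

$$ |s-r| = s - r < 6\alpha_i\nu_i < \frac{6\varepsilon\sqrt{r}}{20} < \frac{\varepsilon}{2}\left(\sqrt{r} + \varepsilon\right), $$

so sensitivity holds in this case too.

Finally, assume $i = 1$, so $r \le \varepsilon^2/400$. In this case we have

$$ d_{\nu_1}(r,s) = \frac{|r-s|}{r+s+\nu_1} < \alpha_1 = 1/2. $$

If $s \le \nu_1$ then

$$ |r-s| < \frac{1}{2}(r+s+\nu_1) \le \frac{3}{2}\nu_1 < \frac{\varepsilon^2}{2} \le \frac{\varepsilon}{2}\left(\sqrt{r} + \varepsilon\right), $$

as required. If $s > \nu_1$ then we have

$$ s - r < \frac{1}{2}(r+s+\nu_1), \quad\text{or}\quad s - r < 2r + \nu_1 \le 3\nu_1 < \frac{\varepsilon^2}{2} \le \frac{\varepsilon}{2}\left(\sqrt{r} + \varepsilon\right), $$

showing that sensitivity holds in all cases. ■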

It is interesting to note that the bound on the size of sensitive $\varepsilon$-approximations cannot be
improved (for general range spaces with bounded VC-dimension). This is because a sensitive
$\varepsilon$-approximation is also an $\varepsilon^2$-net, and there exist range spaces (of any fixed VC-dimension
$\delta$) for which any $\varepsilon^2$-net must be of size $\Omega\left((\delta/\varepsilon^2)\log(1/\varepsilon)\right)$ [KPW92].



From sensitive to relative approximations. Our next observation is that sensitive
approximations are also relative approximations, with an appropriate calibration of parameters.
In a way, this can be regarded as a converse of Theorem 2.13, which shows that a set which
is simultaneously a relative approximation (i.e., a $(\nu,\alpha)$-sample) for an entire appropriate
sequence of pairs of parameters is a sensitive approximation. Specifically, we have:

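Theorem 2.14 Let $0 < \varepsilon, p < 1$ be given parameters, and set $\varepsilon' = \varepsilon\sqrt{p}$. Then, if $Z \subseteq X$ is
a sensitive $\varepsilon'$-approximation for $(X,\mathcal{R})$, it is also a relative $(p,\varepsilon)$-approximation for $(X,\mathcal{R})$.

Proof: We are given that $|\overline{X}(r) - \overline{Z}(r)| \le \frac{\varepsilon'}{2}\left(\overline{X}(r)^{1/2} + \varepsilon'\right)$ for each $r \in \mathcal{R}$.
Now let $r \in \mathcal{R}$ be a range with $\overline{X}(r) \ge p$, that is, $\overline{X}(r) = \alpha p$, for some $\alpha \ge 1$. Then

$$ \left|\overline{X}(r) - \overline{Z}(r)\right| \le \frac{\varepsilon\sqrt{p}}{2}\left(\sqrt{\alpha p} + \varepsilon\sqrt{p}\right) = \frac{\varepsilon\sqrt{\alpha}}{2}\,p + \frac{\varepsilon^2}{2}\,p \le \frac{\varepsilon\alpha}{2}\,p + \frac{\varepsilon\alpha}{2}\,p = \varepsilon\overline{X}(r). $$

Similarly, if $\overline{X}(r) \le p$, then

$$ \left|\overline{X}(r) - \overline{Z}(r)\right| \le \frac{\varepsilon\sqrt{p}}{2}\left(\sqrt{p} + \varepsilon\sqrt{p}\right) = \frac{\varepsilon(1+\varepsilon)}{2}\,p \le \varepsilon p. $$

Hence, $Z$ is a relative $(p,\varepsilon)$-approximation. ■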
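
The calibration $\varepsilon' = \varepsilon\sqrt{p}$ in Theorem 2.14 can be checked numerically: the error allowed by a sensitive $\varepsilon'$-approximation should never exceed the error allowed by a relative $(p,\varepsilon)$-approximation, namely $\varepsilon\max\{\overline{X}(r), p\}$. The sketch below tests this on a grid of parameter values; the function names are illustrative and not from the paper, and a grid check is of course no substitute for the proof.

```python
import math


def sensitive_tolerance(x: float, eps: float) -> float:
    """Error allowed by a sensitive eps-approximation on a range of measure x."""
    return (eps / 2) * (math.sqrt(x) + eps)


def relative_tolerance(x: float, p: float, eps: float) -> float:
    """Error allowed by a relative (p, eps)-approximation on a range of measure x."""
    return eps * max(x, p)


def theorem_2_14_holds_on_grid(steps: int = 20) -> bool:
    """Check sensitive_tolerance(x, eps*sqrt(p)) <= relative_tolerance(x, p, eps)."""
    grid = [i / steps for i in range(1, steps)]  # values strictly inside (0, 1)
    for p in grid:
        for eps in grid:
            eps_prime = eps * math.sqrt(p)
            for j in range(steps + 1):
                x = j / steps  # measures range over [0, 1]
                if sensitive_tolerance(x, eps_prime) > relative_tolerance(x, p, eps):
                    return False
    return True
```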

This observation implies that one can compute relative $(p,\varepsilon)$-approximations efficiently,
in a deterministic fashion, using the algorithms in [Bro95, BCM99] for deterministic
construction of sensitive approximations. We thus obtain the following result.

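Lemma 2.15 Let $(X,\mathcal{R})$ be a range space with finite VC-dimension $\delta$, where $|X| = n$, and
let $0 < \varepsilon, p < 1$ be given parameters. Then one can construct a relative $(p,\varepsilon)$-approximation
for $(X,\mathcal{R})$ of size $O\left(\frac{\delta}{\varepsilon^2 p}\log\frac{\delta}{\varepsilon p}\right)$, in

$$ \min\left\{ O(\delta)^{3\delta}\left(\frac{1}{p\varepsilon^2}\log\frac{\delta}{\varepsilon p}\right)^{\delta} n,\;\; O\left(n^{\delta+1}\right) \right\} $$

deterministic time.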



3 Relative $(p,\varepsilon)$-approximations in the plane

In this section, we present a construction of smaller-size relative $(p,\varepsilon)$-approximations for the
range space involving a set of points in the plane and halfplane ranges. The key ingredient
of the construction is the result of the following subsection, interesting in its own right.

3.1 Spanning trees with small relative crossing number 

We derive a refined "weight-sensitive" version of the classical construction of spanning trees
with small crossing number, as obtained by Chazelle and Welzl [CW89], with a simplified
construction given in [Wel92]. We believe that this refined version is of independent interest,
and expect it to have additional applications.

In accordance with standard notation used in the literature, we denote from now on the 
underlying point set by P. 

We first recall the standard result: 

Theorem 3.1 ([Wel92]) Let $P$ be a set of $n$ points in $\mathbb{R}^d$. Then there exists a straight-edge
spanning tree $\mathcal{T}$ of $P$ such that each hyperplane in $\mathbb{R}^d$ crosses at most $O(n^{1-1/d})$ edges of $\mathcal{T}$.

Let $P$ be a set of $n$ points in the plane. For a line $\ell$, let $w_\ell^+$ (resp., $w_\ell^-$) be the number
of points of $P$ lying above (resp., below or on) $\ell$, and define the weight of $\ell$, denoted by $w_\ell$,
to be $\min\{w_\ell^+, w_\ell^-\}$.

Let $\mathcal{D}_k = \mathcal{D}(P,k)$ be the intersection of all closed halfplanes that contain at least $n-k$
points of $P$. Note that, by the centerpoint theorem (see [Mat03]), $\mathcal{D}_k$ is not empty for
$k < n/3$. Moreover, $\mathcal{D}_k$ is a convex polygon, since it is equal to the intersection of a finite
number of halfplanes. (Indeed, it is equal to the intersection of all halfplanes containing at
least $n-k$ points of $P$ and bounded by lines passing through a pair of points of $P$.)

The region $\mathcal{D}_k$ can be interpreted as a level set of the Tukey depth induced by $P$; see
[ABET00].

Lemma 3.2 Let $P$ be a set of $n$ points in the plane. (i) Any line $\ell$ that avoids the interior of
$\mathcal{D}_k$ has weight $w_\ell \le 2k$. (ii) Any line $\ell$ that intersects the interior of $\mathcal{D}_k$ has weight $w_\ell > k$.

Proof: (i) Translate $\ell$ in parallel until it supports $\mathcal{D}_k$. The new line $\ell'$ must pass through a
vertex $v$ of $\mathcal{D}_k$, which is the intersection of two lines bounding two respective closed halfplanes,
each having $k$ points in its complement. Thus, the union of the complements of these two
halfplanes contains at most $2k$ points, and it contains the side of $\ell'$ (and hence of $\ell$) that
avoids $\mathcal{D}_k$. Thus, $\ell$ has at most $2k$ points on one of its sides.

(ii) The second claim is easy: If the weight of $\ell$ were at most $k$ then, by definition, the
interior of $\mathcal{D}_k$ would be completely contained on one side of $\ell$. ■
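
The weight of a line, and the notion of a $k$-shallow line used throughout this section, are straightforward to compute by brute force. The sketch below is a minimal illustration assuming non-vertical lines given in the form $y = ax + b$; the function names and this parametrization are choices made here, not part of the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def line_weight(points: List[Point], slope: float, intercept: float) -> int:
    """Weight of the line y = slope*x + intercept: min(#above, #below-or-on)."""
    above = sum(1 for (x, y) in points if y > slope * x + intercept)
    below_or_on = len(points) - above
    return min(above, below_or_on)


def is_k_shallow(points: List[Point], slope: float, intercept: float, k: int) -> bool:
    """A line is k-shallow if one of its sides contains at most k points."""
    return line_weight(points, slope, intercept) <= k


P = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0), (3.0, 3.0), (4.0, -1.0)]
w_mid = line_weight(P, 0.0, 0.5)    # the line y = 0.5: two points above it
w_high = line_weight(P, 0.0, 2.5)   # the line y = 2.5: one point above it
```

By Lemma 3.2, lines of weight at most $k$ cannot meet the interior of $\mathcal{D}_k$, which is what lets the construction below treat shallow lines separately from deep ones.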

Lemma 3.3 The set $P \setminus \mathcal{D}_k$ can be covered by pairwise openly disjoint triangles $\Delta_1, \ldots, \Delta_u$,
each containing at most $2k$ points of $P \setminus \mathcal{D}_k$, such that any line intersects at most $O(\log(n/k))$
of these triangles. Moreover, $\Delta_i \cap \partial\mathcal{D}_k \ne \emptyset$, for each $i = 1, \ldots, u$.

Proof: We construct polygons $C_i$ iteratively, as follows. Let $\ell_L$ and $\ell_R$ be the two vertical
lines supporting $\mathcal{D}_k$ on its left and on its right, respectively. The polygon $C_1$ (resp., $C_2$) is the
halfplane to the left (resp., right) of $\ell_L$ (resp., $\ell_R$). The construction maintains the invariant
that the complement of the union of the polygons $C_1, \ldots, C_i$ constructed so far is a convex
polygon $K_i$ that contains $\mathcal{D}_k$, and each edge of the boundary of $K_i$ passes through some
vertex of $\mathcal{D}_k$, so that $K_i \setminus \mathcal{D}_k$ consists of pairwise disjoint connected "pockets". (Initially,
after constructing $C_1$ and $C_2$, we have two pockets: the regions lying respectively above and
below $\mathcal{D}_k$, between $\ell_L$ and $\ell_R$.)

Each step of the construction picks a pocket^3 that contains more than $2k$ points of $P$,
finds a line $\ell$ that supports $\mathcal{D}_k$ at a vertex of the pocket, and subdivides the pocket into two
sub-pockets and a third piece that lies on the other side of $\ell$. The line $\ell$ is chosen so that the
two resulting sub-pockets contain an equal number of points of $P$. The third piece, which
clearly contains at most $2k$ points of $P$ (see Lemma 3.2), is taken to be the next polygon $C_{i+1}$,
and the construction continues in this manner until each pocket has at most $2k$ points. We
refer to the polygons $C_i$ constructed up to this point as non-terminal. We then terminate the
construction, adding all the pockets to the output collection of polygons, referring to them
as terminal polygons. Note that each non-terminal $C_i$ is a (possibly unbounded) triangle,
having a "base" whose relative interior passes through a vertex of $\mathcal{D}_k$, and two other sides,
each of which is a portion of a base of an earlier triangle. The terminal polygons are
pseudo-triangles, each bounded by two straight edges and by a "base" which is a connected portion
of the boundary of $\mathcal{D}_k$, possibly consisting of several edges. See Figure 1.

^3 Apologies for the pun.

Figure 1: The polygon $\mathcal{D}_k$ and the decomposition of its complement.

We claim that each line $\ell$ intersects at most $O(\log(n/k))$ polygons. For this, define the
weight $w(C_i)$ of a non-terminal polygon $C_i$, for $i \ge 3$, to be the number of points of $P$ in
the pocket that was split when $C_i$ was created; the weight of each terminal polygon is the
number of points of $P$ that it contains, which is at most $2k$. Define the level of $C_i$ to be
$\lfloor \log_2 w(C_i) \rfloor$. It is easily checked that $\ell$ crosses at most two terminal polygons (two if it
crosses $\mathcal{D}_k$ and at most one if it misses $\mathcal{D}_k$), and it can cross both (non-base) sides of at
most one (terminal or non-terminal) polygon. Any other polygon $C_i$ crossed by $\ell$ is such that
$\ell$ enters it through its base, reaching it from another polygon whose level is, by construction,
strictly smaller than that of $C_i$. Since there are only $O(\log(n/k))$ distinct levels, the claim
follows.

It is easy to verify that the convex hulls $\Delta_1 = \mathrm{CH}(C_1), \ldots, \Delta_u = \mathrm{CH}(C_u)$ are triangles
with pairwise disjoint interiors, which have the required properties. ■



Lemma 3.4 Let $1 \le k \le n$ be a prespecified parameter. One can construct a spanning tree $\mathcal{T}$ for $P' = P \setminus \mathcal{D}_k$, such that each line intersects at most $O(\sqrt{k}\log(n/k))$ edges of $\mathcal{T}$.

Proof: Construct the decomposition of $P \setminus \mathcal{D}_k$ into $u$ covering triangles $\mathcal{C}_1, \ldots, \mathcal{C}_u$, using Lemma 3.3.

For each $i = 1, \ldots, u$, construct a spanning tree $T_i$ of $P \cap \mathcal{C}_i$ with crossing number $O(\sqrt{k})$, using Theorem 3.1. In addition, connect one point of $P \cap \mathcal{C}_i$ to an arbitrary vertex of $\partial\mathcal{C}_i \cap \partial\mathcal{D}_k$. Since $\mathcal{CH}(P \cap \mathcal{C}_i)$ and $\mathcal{CH}(P \cap \mathcal{C}_j)$ are disjoint, for $i \ne j$, it follows that no pair of edges of any pair of (the plane embeddings of the) trees among $T_1, \ldots, T_u$ cross each other.

Let $G$ be the planar straight-line graph formed by the union of $\partial\mathcal{D}_k$, $T_1, \ldots, T_u$, plus the connecting segments just introduced, and let $T^*$ be a spanning tree of $G$; the vertex set of $T^*$ contains $P \setminus \mathcal{D}_k$.

Let $\ell$ be a line in the plane. The proof of the preceding lemma implies that $\ell$ intersects at most $O(\log(n/k))$ of the polygons $\mathcal{C}_i$. Hence, $\ell$ crosses at most two edges of $\partial\mathcal{D}_k$, at most $O(\log(n/k))$ of the connecting segments, and it can cross edges of at most $O(\log(n/k))$ trees $T_i$, for $i = 1, \ldots, u$. Since $\ell$ crosses at most $O(\sqrt{k})$ edges of each such tree, we conclude that $\ell$ crosses at most $O(\sqrt{k}\log(n/k))$ edges of $T^*$.

Finally, we get rid of the extra "Steiner vertices" of $T^*$ (those not belonging to $P \setminus \mathcal{D}_k$) in a straightforward manner, by making $T^*$ a rooted tree, at some point of $P \setminus \mathcal{D}_k$, and by replacing each path connecting a point $u \in P \setminus \mathcal{D}_k$ to an ancestor $v \in P \setminus \mathcal{D}_k$, where all inner vertices of the path are Steiner points, by the straight segment $uv$. This produces a straight-edge spanning tree $T$ of $P \setminus \mathcal{D}_k$, whose crossing number is at most that of $T^*$. ■

Theorem 3.5 Given a set $P$ of $n$ points in the plane, one can construct a spanning tree $\mathcal{T}$ for $P$ such that any line $\ell$ crosses at most $O(\sqrt{w_\ell}\log(n/w_\ell))$ edges of $\mathcal{T}$, where $w_\ell$ is the weight of $\ell$. The tree $\mathcal{T}$ can be constructed in $O(n^{1+\delta})$ (deterministic) time, for any fixed $\delta > 0$.

Proof: We construct a sequence of subsets of $P$, as follows. Put $P_0 = P$. At the $i$th step, $i \ge 1$, consider the polygon $Q_i = \mathcal{D}(P_{i-1}, 2^i)$, and let $P_i = P_{i-1} \cap Q_i$. We stop when $P_i$ becomes empty. By construction, the $i$th step removes at least $2^i$ points from $P_{i-1}$, so $|P_i| \le |P_{i-1}| - 2^i$, and the process terminates in $O(\log n)$ steps.

For each $i$, construct a spanning tree $T_i$ for $P_{i-1} \setminus Q_i$, using Lemma 3.4 (with $k = 2^i$). Connect the resulting trees by straight segments into a single spanning tree $\mathcal{T}$ of $P$.

We claim that $\mathcal{T}$ is the desired spanning tree. Indeed, consider an arbitrary line $\ell$ of weight $k$. Observe that $\ell$ cannot cross any of the polygons $Q_i$, for $i > U = \lceil \log_2 k \rceil$, since any line that crosses such a polygon must be of weight at least $2^{U+1} > k$, by Lemma 3.2(ii).

Thus $\ell$ crosses only the first $U$ layers of our construction. Hence, the number of edges of $\mathcal{T}$ that $\ell$ crosses is at most

$$\sum_{i=1}^{U} O\left(\sqrt{2^i}\,\log(n/2^i)\right) = O\left(\sum_{i=1}^{U} (\sqrt{2})^i(\log n - i)\right) = O\left(\sqrt{k}\,\log(n/k)\right),$$

as is easily verified. This establishes the bound on the crossing number of $\ell$.
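As a numerical sanity check of the last estimate (a sketch only, not part of the proof), the following Python snippet evaluates the layered sum and compares it with $\sqrt{k}\log(n/k)$. The function names are ours, all logarithms are taken to base 2, and $k$ is assumed to satisfy $2 \le k \le n/2$.

```python
import math

def layered_bound(n: int, k: int) -> float:
    """Sum of sqrt(2^i) * log2(n / 2^i) over the layers i = 1..U, with U = ceil(log2 k)."""
    U = math.ceil(math.log2(k))
    return sum(math.sqrt(2 ** i) * math.log2(n / 2 ** i) for i in range(1, U + 1))

def target_bound(n: int, k: int) -> float:
    """The closed form sqrt(k) * log2(n / k) that the sum is compared against."""
    return math.sqrt(k) * math.log2(n / k)

if __name__ == "__main__":
    n = 2 ** 20
    for k in (4, 64, 1024, 2 ** 14):
        ratio = layered_bound(n, k) / target_bound(n, k)
        print(f"k={k:6d}  sum/closed-form ratio = {ratio:.3f}")
```

The ratio stays bounded by a small constant over the tested values, which is what the geometric growth of the terms $(\sqrt{2})^i$ suggests: the sum is dominated by its last few terms.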

Running time. Computing $Q_i$ can be done in the dual plane, by constructing the convex hulls of the levels $2^i$ and $n - 2^i$ in the arrangement of the lines dual to the points of $P_{i-1}$. We use the algorithm of Matoušek [Mat91a], which constructs the convex hull of a level in $O(|P_{i-1}|\log^4 n)$ time, for a total of $O(n\log^5 n)$ time. This also subsumes the time needed to construct $P_i$ from $P_{i-1}$, for all $i$. Next, we carry out the constructive proof of Lemma 3.3 for $P_{i-1} \setminus Q_i$, which can be implemented to run in $O(n\log^{O(1)} n)$ time. Finally, for each set in the cover, we apply the algorithm of [Wel92] to construct the corresponding subtree with low crossing number. This takes $O(m^{1+\varepsilon})$ time for a set of size $m$, for any fixed $\varepsilon > 0$. We continue in this fashion, as described in the first part of the proof. It is now easy to verify that the resulting construction takes overall $O(n^{1+\varepsilon})$ time, for any fixed $\varepsilon > 0$. ■
3.1.1 The underlying partition and a counterexample in three dimensions



The main technical step in the construction of Theorem 3.5 is the partition of the set of $k$-shallow points of $P$, namely, those that are contained in some halfplane with at most $k$ points of $P$, into subsets, each containing at most $2k$ points, so that any line crosses the convex hulls of at most $O(\log(n/k))$ of these subsets.

It is natural to try to extend this construction to three (or higher) dimensions. However, as we show next, there are examples of point sets in $\mathbb{R}^3$ where no such partition exists, even when the points are in convex position.

To see this, let $m$ be an integer, and put $n = m^2$. Define
$$P = \{ (i, j, i^2 + j^2) \mid i, j = 1, \ldots, m \}.$$
That is, $P$ is the set of the vertices of the $m \times m$ integer grid in the $xy$-plane, lifted to the standard (convex) paraboloid $z = x^2 + y^2$. Thus, all points of $P$ are in convex position (and are thus 1-shallow).

Let $k \ge 1$ be an arbitrary parameter, and consider any partition $\mathcal{P} = \{Q_1, \ldots, Q_u\}$ of $P$ into $u = \Theta(n/k)$ sets, where $k \le |Q_i| \le 2k$, for $i = 1, \ldots, u$. Let $C_i$ denote the two-dimensional convex hull of the projection of $Q_i$ onto the $xy$-plane, for $i = 1, \ldots, u$. The sum of the $x$-span and the $y$-span of $C_i$ is at least $\sqrt{|Q_i|}$ (or else there would be no room for $Q_i$ to contain all its points). Hence, the total length of these spans of $C_1, \ldots, C_u$ is at least $\Omega((n/k)\sqrt{k}) = \Omega(n/\sqrt{k})$. Consider the $2m - 2$ vertical planes $x = 1 + 1/2$, $x = 2 + 1/2$, $\ldots$, $x = m - 1 + 1/2$, and $y = 1 + 1/2$, $y = 2 + 1/2$, $\ldots$, $y = m - 1 + 1/2$. Clearly, the overall number of intersection points between the boundaries of the $C_i$'s and these planes is proportional to the sum of their $x$-spans and $y$-spans. Hence, there is a plane in this family that intersects $\Omega((n/\sqrt{k})/\sqrt{n}) = \Omega(\sqrt{n/k})$ sets among $C_1, \ldots, C_u$, and thus it intersects the $\Omega(\sqrt{n/k})$ corresponding convex hulls among $\mathcal{CH}(Q_1), \ldots, \mathcal{CH}(Q_u)$.
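To make the counting argument concrete, here is a small Python sketch that builds the lifted grid and, for one particular partition (axis-parallel blocks of side $\sqrt{k}$, which is our choice for illustration and not a worst-case or optimal partition), counts how many blocks a vertical plane $x = c + 1/2$ separates. The helper names are hypothetical. The lower bound above applies to every partition; the code only shows that this natural partition already meets the $\sqrt{n/k}$ count.

```python
def lifted_grid(m: int):
    """Vertices of the m x m integer grid lifted to the paraboloid z = x^2 + y^2."""
    return [(i, j, i * i + j * j) for i in range(1, m + 1) for j in range(1, m + 1)]

def block_partition(points, side: int):
    """Group the points into side x side blocks of the grid (one illustrative partition)."""
    blocks = {}
    for (x, y, z) in points:
        key = ((x - 1) // side, (y - 1) // side)
        blocks.setdefault(key, []).append((x, y, z))
    return list(blocks.values())

def separated_by_plane_x(blocks, c: float) -> int:
    """Number of blocks with points strictly on both sides of the vertical plane x = c."""
    return sum(1 for b in blocks
               if any(p[0] < c for p in b) and any(p[0] > c for p in b))

if __name__ == "__main__":
    m, side = 16, 4                      # n = 256 points, blocks of k = 16 points
    blocks = block_partition(lifted_grid(m), side)
    worst = max(separated_by_plane_x(blocks, c + 0.5) for c in range(1, m))
    print(len(blocks), "blocks; worst plane separates", worst)
```

With $m = 16$ and blocks of side 4 we have $n = 256$, $k = 16$, and the worst plane in the family separates $4 = \sqrt{n/k}$ blocks.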

Note that a similar argument can be applied to the set of the vertices of the $n^{1/3} \times n^{1/3} \times n^{1/3}$ integer lattice. In this case, for any partition of this set of the above kind, there always exists a plane that crosses the convex hulls of at least $\Omega((n/k)^{2/3})$ of the subsets. (This matches the upper bound in the partition theorem of Matoušek [Mat92].) Of course, here, (most of) the points are not shallow.

To summarize, there exist sets $P$ of $n$ points in convex position in 3-space (so they are all 1-shallow), such that any partition of $P$ into sets of size (roughly) $k$ will have a plane that crosses at least $\Omega(\sqrt{n/k})$ sets in the partition. (Without the convex position, or shallowness, assumption, there exist sets for which this crossing number is at least $\Omega((n/k)^{2/3})$.) We do not know whether this lower bound is worst-case tight. That is, can a set $P$ of $n$ $k$-shallow points in 3-space be partitioned into $\Theta(n/k)$ subsets, so that no plane separates more than $O(\sqrt{n/k})$ subsets? If this were the case, applying the standard construction of spanning trees with small crossing numbers to each subset would result in a spanning tree with crossing number $O(n^{1/2}k^{1/6})$.

Note that this still leaves open the (more modest) possibility of a partition which is "depth sensitive", that is, a partition into subsets of size roughly $k$, with the property that any halfspace that contains $m$ points crosses at most (or close to) $O((m/k)^{2/3})$ sets in the partition.
3.2 Relative $(p, \varepsilon)$-approximations for halfplanes

We can turn the above construction of a spanning tree with small relative crossing number into a construction of a relative $(p, \varepsilon)$-approximation for a set of points in the plane and for halfplane ranges, as follows.

Let $P$ be a set of $n$ points in the plane, and let $T$ be a spanning tree of $P$ as provided in Theorem 3.5. We replace $T$ by a perfect matching $M$ of $P$, with the same relative crossing number, i.e., the number of pairs of $M$ that are separated by a halfplane of weight $k$ is at most $O(\sqrt{k}\log(n/k))$. This is done in a standard manner: we first convert $T$ to a spanning path whose relative crossing number is at most twice the crossing number of $T$, and then pick every other edge of the path.
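The tree-to-matching conversion can be sketched as follows. This is a minimal illustration under our own assumptions: the tree is given as a list of vertex-index pairs, the spanning path is taken to be the depth-first (preorder) ordering of the vertices, which is the usual way to shortcut an Euler tour of the tree, and the function names are hypothetical.

```python
from collections import defaultdict

def tree_to_path(tree_edges, root=0):
    """Preorder (DFS) ordering of the tree's vertices; consecutive vertices form the path."""
    adj = defaultdict(list)
    for u, v in tree_edges:
        adj[u].append(v)
        adj[v].append(u)
    order, seen, stack = [], {root}, [root]
    while stack:
        u = stack.pop()
        order.append(u)
        # Push children in reverse so that they are visited in their listed order.
        for v in reversed(adj[u]):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return order

def path_to_matching(order):
    """Keep every other edge of the path: (v0, v1), (v2, v3), ..."""
    if len(order) % 2 != 0:
        raise ValueError("a perfect matching needs an even number of points")
    return [(order[i], order[i + 1]) for i in range(0, len(order), 2)]

if __name__ == "__main__":
    path = tree_to_path([(0, 1), (0, 2), (1, 3)])
    print(path, path_to_matching(path))
```

Each path edge shortcuts a portion of the Euler tour, and the tour uses every tree edge twice, which is where the factor of two in the crossing number comes from.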

We now construct a coloring of $P$ with low discrepancy, by randomly coloring the points in each pair of $M$. Specifically, each pair is randomly and independently colored either as $-1, +1$ or as $+1, -1$, with equal probability. The standard theory of discrepancy (see [Cha01]) yields the following variant.

Lemma 3.6 Given a set $P$ of $n$ points in the plane, one can construct a coloring $\chi : P \to \{-1, 1\}$, such that, for any halfplane $h$,
$$\chi(h \cap P) = O\left(|h \cap P|^{1/4}\log n\right).$$
The coloring is balanced: each color class consists of exactly $n/2$ points of $P$.



Proof: As shown in [Mat99], if a halfplane $h$ crosses $t$ edges of the matching, then its discrepancy is $O(\sqrt{t\log n})$, with high probability. As shown above, $t = O(\sqrt{k}\log(n/k))$, for $k = |h \cap P|$, so the discrepancy of $h$ is, with high probability,
$$O\left(\sqrt{\sqrt{k}\,\log(n/k)\,\log n}\right) = O\left(k^{1/4}\log n\right).$$ ■
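The random coloring of the matched pairs is simple to state in code. The sketch below (function names are ours) colors each pair with opposite signs and evaluates the discrepancy of a range given as a list of points. A pair that lies entirely inside or entirely outside the range contributes zero, so only the separated pairs matter, which is why a small crossing number yields a small discrepancy.

```python
import random

def color_matching(matching, rng=None):
    """Color each matched pair (+1, -1) or (-1, +1), independently and with equal probability."""
    rng = rng or random.Random()
    chi = {}
    for u, v in matching:
        s = rng.choice((-1, 1))
        chi[u], chi[v] = s, -s
    return chi

def discrepancy(chi, points_in_range):
    """Absolute value of the sum of the colors of the points that fall in the range."""
    return abs(sum(chi[p] for p in points_in_range))

if __name__ == "__main__":
    matching = [(0, 1), (2, 3), (4, 5)]
    chi = color_matching(matching, random.Random(7))
    print(chi, discrepancy(chi, [0, 2, 3]))
```

By construction the coloring is balanced: exactly one point of each pair receives the color $+1$.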
We need the following fairly trivial technical lemma.

Lemma 3.7 For any $x \ge 0$, $y > 0$, and $0 < p < 1$, we have $x^p < (x + y)/y^{1-p}$.

Proof: Observe that $x^p < (x + y)^p = \frac{x + y}{(x + y)^{1-p}} \le \frac{x + y}{y^{1-p}}$. ■

As we next show, the improved discrepancy bound of Lemma 3.6 leads to an improved bound on the size of $(\nu, \alpha)$-samples for our range space, and, consequently, for the size of relative $(p, \varepsilon)$-approximations.

Theorem 3.8 Given a set $P$ of $n$ points in the plane, and parameters $0 < \alpha < 1$ and $0 < \nu < 1$, one can construct a $(\nu, \alpha)$-sample $Z \subseteq P$ of size $O\left(\frac{1}{\nu\alpha^{4/3}}\log^{4/3}\frac{1}{\nu\alpha}\right)$.

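Proof: Following one of the classical constructions of $\varepsilon$-approximations (see [Cha01]), we repeatedly halve $P$, until we obtain a subset of size as asserted in the theorem, and then argue that the resulting set is a $(\nu, \alpha)$-sample. Formally, set $P_0 = P$, and partition $P_{i-1}$ into two equal halves, using Lemma 3.6; let $P_i$ and $P_i'$ denote the two halves (consisting of the points that are colored $+1$, $-1$, respectively). We keep $P_i$, remove $P_i'$, and continue with the halving process. Let $n_i = |P|/2^i$ denote the size of $P_i$. For any halfplane $h$, we have
$$\Bigl|\,|P_i \cap h| - |P_i' \cap h|\,\Bigr| \le c\,|h \cap P_{i-1}|^{1/4}\log n_{i-1},$$
where $c$ is some appropriate constant. Recalling that $\overline{P_i}(h) = |P_i \cap h|/|P_i|$, and that $|P_i| = |P_i'| = n_{i-1}/2$, this can be rewritten as
$$\left|\overline{P_i}(h) - \overline{P_i'}(h)\right| \le 2c\,\frac{\overline{P_{i-1}}(h)^{1/4}}{n_{i-1}^{3/4}}\log n_{i-1}.$$
Since $P_{i-1} = P_i \cup P_i'$, we have
$$\overline{P_{i-1}}(h) = \frac{|h \cap P_{i-1}|}{|P_{i-1}|} = \frac{|h \cap P_i|}{2|P_i|} + \frac{|h \cap P_i'|}{2|P_i'|} = \frac{1}{2}\left(\overline{P_i}(h) + \overline{P_i'}(h)\right).$$
Hence, we have
$$\left|\overline{P_{i-1}}(h) - \overline{P_i}(h)\right| = \frac{\left|\overline{P_i}(h) - \overline{P_i'}(h)\right|}{2} \le \frac{c\,\overline{P_{i-1}}(h)^{1/4}}{n_{i-1}^{3/4}}\log n_{i-1}.$$
Applying Lemma 3.7 with $p = 1/4$, $x = \overline{P_{i-1}}(h)$, and $y = \nu$, the last expression is at most
$$\frac{c\log n_{i-1}}{n_{i-1}^{3/4}} \cdot \frac{\overline{P_{i-1}}(h) + \nu}{\nu^{3/4}} \le \frac{c\log n_{i-1}}{(\nu n_{i-1})^{3/4}}\left(\overline{P_{i-1}}(h) + \overline{P_i}(h) + \nu\right).$$
This implies that $d_\nu\left(\overline{P_{i-1}}(h), \overline{P_i}(h)\right) < \frac{c\log n_{i-1}}{(\nu n_{i-1})^{3/4}}$. The triangle inequality then implies that
$$d_\nu\left(\overline{P}(h), \overline{P_i}(h)\right) \le \sum_{k=1}^{i} d_\nu\left(\overline{P_{k-1}}(h), \overline{P_k}(h)\right) < \sum_{k=1}^{i} \frac{c\log n_{k-1}}{(\nu n_{k-1})^{3/4}} = O\left(\frac{\log n_i}{(\nu n_i)^{3/4}}\right) \le \alpha,$$
for $n_i = \Omega\left(\frac{1}{\nu\alpha^{4/3}}\log^{4/3}\frac{1}{\nu\alpha}\right)$. The theorem then follows by taking $Z$ to be the smallest $P_i$ which still satisfies this size constraint. ■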
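The halving process of the proof can be sketched in a few lines of Python. This is only an illustration of the control flow: the matching used here pairs neighbors in sorted order, which is a placeholder of our own and does not have the low relative crossing number that the analysis requires; in the actual construction one would plug in the matching derived from the spanning tree of Theorem 3.5. All function names are hypothetical.

```python
import random

def halve(points, matching_fn, rng):
    """One halving step: match the points, color each pair at random, keep the +1 class."""
    kept = []
    for u, v in matching_fn(points):
        kept.append(u if rng.random() < 0.5 else v)
    return kept

def repeated_halving(points, target_size, matching_fn, seed=0):
    """Halve as long as the next half would still have at least target_size points."""
    rng = random.Random(seed)
    current = list(points)
    while len(current) // 2 >= target_size and len(current) % 2 == 0:
        current = halve(current, matching_fn, rng)
    return current

def consecutive_pairs(points):
    """Placeholder matching: pair neighbors in sorted order (NOT the low-crossing matching)."""
    pts = sorted(points)
    return [(pts[i], pts[i + 1]) for i in range(0, len(pts), 2)]

if __name__ == "__main__":
    pts = [(i, (i * i) % 7) for i in range(64)]
    Z = repeated_halving(pts, 8, consecutive_pairs, seed=1)
    print(len(Z), sorted(Z))
```

The loop stops at the smallest set that still meets the size constraint, mirroring the choice of $Z$ at the end of the proof.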
Using Theorem 2.9, we thus obtain:

Corollary 3.9 Given a set $P$ of $n$ points in the plane, and parameters $0 < \varepsilon < 1$ and $0 < p < 1$, one can construct a relative $(p, \varepsilon)$-approximation $Z \subseteq P$ of size $O\left(\frac{1}{\varepsilon^{4/3}p}\log^{4/3}\frac{1}{\varepsilon p}\right)$.

Remark: One can speed up the construction of the relative $(p, \varepsilon)$-approximation of Corollary 3.9 by first drawing a random sample of slightly larger size, which is guaranteed, with high probability, to be a relative approximation of the desired kind, and then using halving to

decimate it to the desired size. Implemented carefully, this takes O + lognj ^ time, 

and thus produces a relative (p, e)-approximation, with high probability, of the desired size 

In fact, the preceding analysis leads to an improved bound for sensitive approximations for our range space. The improvement is in terms of the quality of the "sensitivity" of the approximation, which is achieved at the cost of a slight increase (by a sublogarithmic factor) in its size, as compared to the standard bound, provided in Theorem 2.13. That is, we have:

Theorem 3.10 Let $P$ be a set of $n$ points in the plane, and let $\varepsilon > 0$ be a parameter. One can compute a subset $Z \subseteq P$ of size $O\left((1/\varepsilon^2)\log^{4/3}(1/\varepsilon)\right)$, such that for any halfplane $h$, we have
$$\left|\overline{P}(h) - \overline{Z}(h)\right| \le \frac{1}{2}\left(\varepsilon^{3/2}\,\overline{P}(h)^{1/4} + \varepsilon^2\right).$$

Proof: Fix parameters $0 < \alpha < 1$ and $\nu > 0$, and apply the construction of Theorem 3.8 until we get a subset $Z$ of size $m = O\left((1/\varepsilon^2)\log^{4/3}(1/\varepsilon)\right)$; the constant of proportionality will be determined by the forthcoming considerations.

The key observation is that $Z$ is a $(\nu, \alpha)$-sample for any $0 < \alpha < 1$ and $\nu > 0$ that satisfy $m = \Omega\left(\frac{1}{\nu\alpha^{4/3}}\log^{4/3}\frac{1}{\nu\alpha}\right)$, because the construction is oblivious to the individual values of $\alpha$ and $\nu$, and just requires that the size of the sample remains larger than the above threshold.

Using this observation, we proceed to show that $Z$ satisfies the property asserted in the theorem. So let $h$ be a halfplane. Suppose first that $\overline{P}(h) \le \varepsilon^2$. By Corollary 3.9, $Z$ is a relative $(\varepsilon^2, 1/2)$-approximation to $P$, with an appropriate choice of the constant of proportionality (note the change of roles of "$p$" and "$\varepsilon$"). This implies that $\left|\overline{Z}(h) - \overline{P}(h)\right| \le \varepsilon^2/2$.

Suppose then that $p_h = \overline{P}(h) > \varepsilon^2$, and set $\varepsilon_h = \varepsilon^{3/2}/p_h^{3/4}$. Observe that
$$\frac{1}{\varepsilon_h^{4/3}p_h}\log^{4/3}\frac{1}{\varepsilon_h p_h} = O\left(\frac{1}{\varepsilon^2}\log^{4/3}\frac{1}{\varepsilon}\right) = O(m).$$
Thus, with an appropriate choice of the constant of proportionality for $m$, $Z$ is a relative $(p_h, \varepsilon_h/2)$-approximation, which implies that
$$\left|\overline{P}(h) - \overline{Z}(h)\right| \le \frac{1}{2}\varepsilon_h\overline{P}(h) = \frac{1}{2} \cdot \frac{\varepsilon^{3/2}}{\overline{P}(h)^{3/4}}\,\overline{P}(h) = \frac{1}{2}\varepsilon^{3/2}\,\overline{P}(h)^{1/4},$$
as asserted.



This compares favorably with the result of Brönnimann et al. [Bro95, BCM99], which in this case implies that there exists a subset of size $O\left((1/\varepsilon^2)\log(1/\varepsilon)\right)$ such that, for each halfplane $h$, we have $\left|\overline{P}(h) - \overline{Z}(h)\right| \le (\varepsilon/2)\left(\overline{P}(h)^{1/2} + \varepsilon\right)$. Our bound is smaller when $\overline{P}(h) > \varepsilon^2$ and is the same otherwise.
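The comparison between the two error bounds is easy to tabulate. The sketch below encodes the two expressions as we read them, namely $\frac{1}{2}(\varepsilon^{3/2}\overline{P}(h)^{1/4} + \varepsilon^2)$ for Theorem 3.10 and $(\varepsilon/2)(\overline{P}(h)^{1/2} + \varepsilon)$ for the earlier result; the function names are ours.

```python
def new_bound(measure: float, eps: float) -> float:
    """Error bound of Theorem 3.10: (eps^{3/2} * measure^{1/4} + eps^2) / 2."""
    return 0.5 * (eps ** 1.5 * measure ** 0.25 + eps ** 2)

def classical_bound(measure: float, eps: float) -> float:
    """Sensitive-approximation bound of [Bro95, BCM99]: (eps / 2) * (sqrt(measure) + eps)."""
    return 0.5 * eps * (measure ** 0.5 + eps)

if __name__ == "__main__":
    eps = 0.1
    for measure in (1e-4, eps ** 2, 0.1, 0.5, 1.0):
        print(f"measure={measure:7.4f}  new={new_bound(measure, eps):.5f}  "
              f"classical={classical_bound(measure, eps):.5f}")
```

The two expressions coincide at $\overline{P}(h) = \varepsilon^2$. Above that value the new bound is smaller, and below it both are $\Theta(\varepsilon^2)$, so they agree up to a constant factor.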
4 Relative $(p, \varepsilon)$-approximations in higher dimensions

4.1 Relative $(p, \varepsilon)$-approximations in $\mathbb{R}^3$

The construction in higher dimensions is different from the planar one, because of our present inability to extend the construction of spanning trees with low relative crossing number to three or higher dimensions. For this reason we use the following different strategy.

We say that a hyperplane $h$ separates a set $Q \subset \mathbb{R}^d$ if $h$ intersects the interior of $\mathcal{CH}(Q)$; that is, each of the open halfspaces that $h$ bounds intersects $Q$.

The main technical step in the construction is given in the following theorem.

Theorem 4.1 Let $P$ be a set of $n$ points in $\mathbb{R}^3$, and let $0 < \varepsilon < 1$, $0 < p < 1$ be given parameters. Then there exists a set $Z \subseteq P$, of size $O\left(\frac{1}{\varepsilon^{3/2}p}\log^{3/2}\frac{1}{\varepsilon p}\right)$, such that, for any halfspace $h$ with $\overline{P}(h) \le p$, we have
$$\left|\overline{Z}(h) - \overline{P}(h)\right| \le \varepsilon p. \qquad (4)$$



Remark: Let us note right away the difference between Eq. (4) and the situation in the preceding sections. That is, up to now we have handled ranges of measure at least $p$, whereas Eq. (4) applies to ranges of measure at most $p$. This issue requires a somewhat less standard construction, that will culminate in a sequence of approximation sets, each catering to a different range of halfspace measures. Nevertheless, the overall size of these sets will satisfy the above bound, and the cost of accessing them will be small.

Proof: Put $k = \lfloor np \rfloor$, and apply the shallow partition theorem of Matoušek [Mat91b], to obtain a partition of $P$ into $s \le n/k = O(1/p)$ subsets $P_1, \ldots, P_s$, each of size between $k + 1$ and $2k$, such that any $k$-shallow halfspace $h$ (namely, a halfspace that contains at most $k$ points of $P$) separates at most $c\log s$ subsets, for some absolute constant $c$. (Note that if $h$ meets any $P_i$, it has to separate it, because $h$ is too shallow to fully contain $P_i$.) Without loss of generality, we can carry out the construction so that the size of each $P_i$ is even.

We then construct, for each subset $P_i$, a spanning tree of $P_i$ with crossing number $O(k^{2/3})$ [CW89, Wel92], and convert it, as in the preceding section, to a perfect matching of $P_i$, with the same asymptotic bound on its crossing number, which is the maximum number of pairs in the matching that a halfspace separates. We combine all these perfect matchings to a perfect matching of the entire set $P$.

We then color each matched pair independently, as above, coloring at random one of its points by either $-1$ or $+1$, with equal probabilities, and the other point by the opposite color. Let $R_1$ be the set of points colored $-1$; we have $|R_1| = n/2$. With high probability, the discrepancy of any halfspace $h$ is at most $\sqrt{6\,\xi(h)\ln(2n)}$, where $\xi(h)$ is the crossing number of $h$ (see [Cha01]); we may assume that the coloring does indeed have this property. (If we do not care about the running time, we can verify that the constructed set has the required property, and if not regenerate it.)

Hence, if $h$ is a $k$-shallow halfspace, then, by construction, $\xi(h) = O(k^{2/3}\log s)$, because $h$ separates $O(\log s)$ subsets and crosses $O(k^{2/3})$ edges of the spanning tree of each of them. Hence the discrepancy of any $k$-shallow halfspace $h$ is $O(k^{1/3}\log n)$.

We continue recursively in this manner for $j$ steps, producing a sequence of subsets $R_0 = P, R_1, \ldots, R_j$, where $R_i$ is obtained from $R_{i-1}$ by applying the partitioning of [Mat91b] with a different parameter $k_{i-1}$, and then by using the above coloring procedure on the resulting perfect matching. We take $k_{i-1} = k\min\{c/2^{i-1}, 1\}$, where $c$ is the constant derived in the following lemma. (The bound asserted in the lemma holds with high probability if we do not verify that our colorings have small discrepancy, and is worst-case if we do verify it.)



Lemma 4.2 There exists an absolute constant $c$ such that any $pn$-shallow halfspace $h$ satisfies, for any $i \le j$,
$$|h \cap R_i| \le \frac{cpn}{2^i},$$
where $j$ is the largest index satisfying $n_j \ge \frac{c}{p}\ln^{3/2}\frac{1}{p}$, where $n_j = |R_j| = n/2^j$.

Proof: Delegated to Appendix A.1. ■

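The lemma implies that $h$ is $k_i$-shallow in each of the subsets $R_0, \ldots, R_j$, so we can use the above bound on the discrepancy of $h$ with respect to each of these subsets. (The reader can note the similarity between the forthcoming analysis and the proof of Lemma 4.2.) We thus have
$$\left|\overline{P}(h) - \overline{R_1}(h)\right| = \frac{\bigl|\,|h \cap P| - 2|h \cap R_1|\,\bigr|}{|P|} = \frac{\chi(h, P)}{n} = O\left(\frac{k^{1/3}\log n}{n}\right),$$
$$\left|\overline{R_1}(h) - \overline{R_2}(h)\right| = \frac{\chi(h, R_1)}{|R_1|} = O\left(\frac{k_1^{1/3}\log(n/2)}{n/2}\right),$$
$$\vdots$$
$$\left|\overline{R_{j-1}}(h) - \overline{R_j}(h)\right| = \frac{\chi(h, R_{j-1})}{|R_{j-1}|} = O\left(\frac{2^{j-1}k_{j-1}^{1/3}\log(n/2^{j-1})}{n}\right).$$
Substituting $k_i = k\min\{c/2^i, 1\}$, for each $i$, and adding up the inequalities, it is easily checked that the last right-hand side dominates the sum (compare with the analysis in the proof of Theorem 3.5), so we obtain, using the triangle inequality,
$$\left|\overline{P}(h) - \overline{R_j}(h)\right| = O\left(\frac{2^{2j/3}k^{1/3}\log(n/2^{j-1})}{n}\right).$$
We choose $j$ to be the largest index for which this bound is at most $\varepsilon p \approx \varepsilon k/n$. That is,
$$2^j = O\left(\frac{\varepsilon^{3/2}k}{\log^{3/2}(n/2^j)}\right).$$
Note that this can be rewritten as $n_j = \Omega\left(\frac{1}{\varepsilon^{3/2}p}\log^{3/2}\frac{1}{\varepsilon p}\right)$, which implies that $n_j$ satisfies the lower bound required in Lemma 4.2, provided that $\varepsilon$ is smaller than some appropriate absolute constant. Hence, since $j$ was chosen as large as possible, the size of $R_j$ is
$$\frac{n}{2^j} = O\left(\frac{\log^{3/2}(n/2^j)}{\varepsilon^{3/2}p}\right) = O\left(\frac{1}{\varepsilon^{3/2}p}\log^{3/2}\frac{1}{\varepsilon p}\right).$$
Taking $Z = R_j$ completes the proof of Theorem 4.1. ■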
4.1.1 How to obtain a relative approximate count.



Construction. The preceding construction used a fixed $k = pn$, and assumed that the query halfspace is at most $k$-shallow. However, our goal is to construct a subset that caters to all halfspaces whose measure is at least some given threshold. While unable to meet this goal exactly, with a single subset, we almost get there, in the following manner. Let $p$ be the given threshold parameter. We consider the geometric sequence $\{p_t\}_{t \ge 0}$, where $p_t = 2^t p$; the last (largest) element is $\approx 1/2$, and its index is $t_{\max} = O(\log\frac{1}{p})$. For each $t$, we construct a relative $(p_t, c\varepsilon)$-approximation $Z_t$ for $P$, as in Theorem 4.1, where $c$ is a sufficiently small constant, whose value will be determined later. Clearly, the overall size of all these sets is dominated by the size of the first set, namely, it is
$$O\left(\frac{1}{\varepsilon^{3/2}p}\log^{3/2}\frac{1}{\varepsilon p}\right).$$
We output the entire sequence $Z_0, Z_1, \ldots$, as a substitute for a single relative $(p, \varepsilon)$-approximation, and use it as follows.

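Answering a query. Let $h$ be a given halfspace, so that $w = |h \cap P| \ge pn$. Let $t \ge 1$ be the index (initially unknown) for which $p_{t-1}n \le w < p_t n$. Thus $h$ is $p_t n$-shallow, and is also $p_s n$-shallow, for every $s \ge t$. Hence, if we use the set $Z_s$, for each $s \ge t$, to approximate $w$, we get, by Theorem 4.1, a count $C_s := \overline{Z_s}(h) \cdot |P|$, which satisfies $|C_s - w| \le c\varepsilon p_s n$. In other words, we have
$$C_s - c\varepsilon p_s n \le w \le C_s + c\varepsilon p_s n,$$
for each $s \ge t$.

To answer the query, we access the sets $Z_{t_{\max}}, Z_{t_{\max}-1}, \ldots$, in decreasing order, and find the largest index $s$ satisfying
$$c\varepsilon p_s n \le \frac{4}{5}C_s. \qquad (5)$$
We return $C_s$ as the desired approximate count.

Analysis. We claim that Eq. (5) must hold at $s = t$, assuming that $\varepsilon$ is sufficiently small (for instance, $c\varepsilon \le 3/22$ suffices). Indeed, since $\frac{1}{2}p_t n \le w < p_t n$, we have
$$C_t + c\varepsilon p_t n \ge w \ge \frac{1}{2}p_t n, \quad\text{or}\quad C_t \ge \left(\frac{1}{2} - c\varepsilon\right)p_t n.$$
Hence, $C_t - c\varepsilon p_t n \ge \left(\frac{1}{2} - 2c\varepsilon\right)p_t n$. On the other hand, $C_t \le c\varepsilon p_t n + w < (1 + c\varepsilon)p_t n$. Combining these two inequalities, we obtain
$$C_t - c\varepsilon p_t n \ge \frac{\frac{1}{2} - 2c\varepsilon}{1 + c\varepsilon}\,C_t \ge \frac{1}{5}C_t,$$
for $\varepsilon$ in the assumed range. Our choice of $s$ thus satisfies $s \ge t$. Moreover, we have
$$\frac{1}{5}C_s \le C_s - c\varepsilon p_s n \le w \le C_s + c\varepsilon p_s n \le \frac{9}{5}C_s.$$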
This determines $w$, up to a factor of 9, which thus determines the correct index $t$ (up to $\pm O(1)$). In fact, as our query answering procedure actually does, we do not have to find the exact value of $t$, because we use a smaller value of $\varepsilon$ in the construction of the sets $Z_s$. Specifically, with an appropriate choice of $c$, we have $p_t \ge 2cp_s$, so
$$C_s \le w + c\varepsilon p_s n \le w + \frac{1}{2}\varepsilon p_t n = w + \varepsilon p_{t-1}n \le (1 + \varepsilon)w,$$
and, similarly,
$$C_s \ge w - c\varepsilon p_s n \ge w - \frac{1}{2}\varepsilon p_t n = w - \varepsilon p_{t-1}n \ge (1 - \varepsilon)w,$$
so $C_s$ is an $\varepsilon$-approximate count of $P \cap h$, establishing the correctness of our procedure. (The specific choice of $c$, which we do not spell out, can easily be worked out from the preceding analysis.)

Note that our structure also handles halfspaces $h$ with $w = |h \cap P| < pn$. Specifically, if we find an index $s$ satisfying Eq. (5) then $C_s$ is an $\varepsilon$-approximate count of $P \cap h$, as the preceding analysis shows. If no such $s$ is found then we must have $w < p_0 n = pn$ (otherwise, as just argued, there would exist such an $s$ and the procedure would find it). In this case we have $|w - C_0| \le \varepsilon pn$, and we return $C_0$ with the guarantee that (a) $w < pn$, and (b) $|w - C_0| \le \varepsilon pn$.

Note that this constitutes a somewhat unorthodox approach: we have logarithmically many sets instead of a single one (although their combined size is asymptotically the same as that of the largest one), and we access them sequentially to find the one that gives the best approximation. An interesting useful feature of the construction is that, if the given halfspace $h$ has weight $w$ that satisfies $p_{t-1}n \le w < p_t n$, then the approximate counting mechanism accesses sets whose overall size is only $O\left(\frac{1}{\varepsilon^{3/2}p_t}\log^{3/2}\frac{1}{\varepsilon p_t}\right)$. That is, the larger $w$ is, the faster is the procedure.
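The query procedure can be summarized by the following Python sketch. It is a brute-force illustration under several assumptions of ours: `samples[s]` stands for the set $Z_s$ built for $p_s = 2^s p$, the halfspace is passed as a membership predicate `inside`, and the stopping rule is the test $c\varepsilon p_s n \le \frac{4}{5}C_s$, where the constant $4/5$ is our reading of condition (5), inferred from the statement that $w$ is determined up to a factor of 9. The function name and signature are hypothetical.

```python
def approximate_count(samples, n, p, eps, c, inside, threshold=0.8):
    """Scan Z_tmax, ..., Z_0 and return (s, C_s) for the first s passing the stopping test.

    samples[s] plays the role of Z_s for p_s = 2^s * p; inside(point) tests membership in h.
    If no level passes the test, return (None, C_0), i.e. the count from the finest set.
    """
    count = 0.0
    for s in range(len(samples) - 1, -1, -1):
        z = samples[s]
        # C_s = (measure of h in Z_s) * |P|
        count = sum(1 for q in z if inside(q)) * n / len(z)
        if c * eps * (2 ** s) * p * n <= threshold * count:
            return s, count
    return None, count

if __name__ == "__main__":
    P = list(range(1024))            # toy one-dimensional "points"
    samples = [P] * 6                # stand-ins for Z_0..Z_5 (here simply P itself)
    print(approximate_count(samples, 1024, 1 / 64, 0.5, 0.1, lambda q: q < 300))
```

Because the sets are scanned from the coarsest (smallest) to the finest (largest), a heavy halfspace is answered after touching only small sets, in line with the remark that the procedure is faster for larger $w$.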
To summarize, we have shown: 

Theorem 4.3 Given a set $P$ of $n$ points in $\mathbb{R}^3$, and two parameters $0 < \varepsilon < 1$, $0 < p < 1$, we can construct $k = O\left(\log\frac{1}{p}\right)$ subsets of $P$, $Z_0, Z_1, \ldots, Z_k$, of total size $O\left(\frac{1}{\varepsilon^{3/2}p}\log^{3/2}\frac{1}{\varepsilon p}\right)$, so that, given any halfspace $h$ containing $qn$ points of $P$, we can find a set $Z_t$ that satisfies
$$\left|\overline{P}(h) - \overline{Z_t}(h)\right| \le \begin{cases} \varepsilon\,\overline{P}(h), & \text{if } q \ge p, \\ \varepsilon p, & \text{if } q < p. \end{cases}$$




The (brute-force) time it takes to search for Zt and obtain the count \h (1 Zt\ is 
4.2 Higher dimensions

The preceding construction can be generalized to higher dimensions, with some complications. We first introduce the following parameters:
$$\gamma = 1 + \frac{1 - \frac{1}{d^*}}{1 + \frac{1}{d}}, \quad\text{where } d^* = \lfloor d/2 \rfloor, \quad\text{and}\quad \mu = \frac{2d}{d + 1}.$$
Note that, for $d \ge 4$, $1 < \gamma < 2$ (and $\gamma$ tends to 2 as $d$ increases), and $\mu < 2$ (and $\mu$ tends to 2 as $d$ increases).

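The analogous version of Theorem 4.1 is:

Theorem 4.4 Let $P$ be a set of $n$ points in $\mathbb{R}^d$, $d \ge 4$, and let $0 < \varepsilon < 1$, $0 < p < 1$ be given parameters. Then there exists a set $Z \subseteq P$, of size $O\left(\frac{d^{\mu/2}}{\varepsilon^{\mu}p^{\gamma}}\log^{\mu/2}\frac{d}{\varepsilon p}\right)$, such that, for any $pn$-shallow halfspace $h$, we have
$$\left|\overline{Z}(h) - \overline{P}(h)\right| \le \varepsilon p,$$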



provided that n = log''/^ ^ 



Proof: As above, put $k = \lfloor np \rfloor$, and apply Matoušek's shallow partition theorem [Mat91b], to obtain a partition of $P$ into $s = O(1/p)$ subsets $P_1, \ldots, P_s$, each of size between $k + 1$ and $2k$, such that any $k$-shallow halfspace separates at most $c(n/k)^{\beta}$ subsets, for some absolute constant $c$, where $\beta = 1 - 1/\lfloor d/2 \rfloor = 1 - 1/d^*$. (As above, if $h$ meets any $P_i$, it has to separate it.) Also, we may assume that the size of each $P_i$ is even.

We then construct, for each subset $P_i$, a spanning path of $P_i$ with crossing number $O(k^{\alpha})$ [Wel92], for $\alpha = 1 - 1/d$, convert it to a perfect matching of $P_i$, with the same asymptotic crossing number, and combine all these matchings to a perfect matching of $P$.

We then apply the same coloring scheme as in the three-dimensional case, and let $R_1$ be the set of points colored by $-1$; we have $|R_1| = n/2$. With high probability, the discrepancy of any halfspace $h$ is at most $\sqrt{2d\,\xi(h)\ln 2n}$, where $\xi(h)$ is the number of pairs in the matching that $h$ separates [Cha01]. If $h$ is $k$-shallow then, by construction, $\xi(h) = O\left(k^{\alpha}(n/k)^{\beta}\right)$. Hence the discrepancy of $h$ is
$$\chi(h, P) = O\left(\sqrt{d}\,k^{x}n^{y}\log^{1/2} n\right), \quad\text{for } x = \frac{1}{2}(\alpha - \beta), \text{ and } y = \frac{1}{2}\beta.$$
We continue recursively in this manner for $j$ steps, producing, as above, a sequence of subsets $R_0 = P, R_1, \ldots, R_j$, where $R_i$ is obtained from $R_{i-1}$ by applying the partitioning of [Mat91b] with a different parameter $k_{i-1}$, and then by using the above coloring procedure on the resulting perfect matching. We take $k_{i-1} = k\min\{c/2^{i-1}, 1\}$, where $c$ is the constant derived in the following lemma. (As in the 3-dimensional case, if we want the proof of the theorem to be constructive, we either verify that each half-sample has the desired low discrepancy, and then the bounds are worst-case, or else the bounds hold with high probability.)

Lemma 4.5 There exists an absolute constant c such that any pn-shallow halfspace satisfies, 
for any i < j , 

\hnR,\<'-^, 
where j is the largest index satisfying Uj = fi^^^log^^^ where nj = \Pj \ = n/2K 

Proof: Delegated to Appendix A.2. ■

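As in the 3-dimensional case, the lemma justifies the following chain of inequalities (note again the similarity between the proof of the lemma and the analysis below).
$$\left|\overline{P}(h) - \overline{R_1}(h)\right| = \frac{\bigl|\,|h \cap P| - 2|h \cap R_1|\,\bigr|}{|P|} = \frac{\chi(h, P)}{n} = O\left(\frac{\sqrt{d}\,k^{x}n^{y}\log^{1/2} n}{n}\right),$$
$$\left|\overline{R_1}(h) - \overline{R_2}(h)\right| = \frac{\chi(h, R_1)}{|R_1|} = O\left(\frac{\sqrt{d}\,k_1^{x}(n/2)^{y}\log^{1/2}(n/2)}{n/2}\right),$$
$$\vdots$$
$$\left|\overline{R_{j-1}}(h) - \overline{R_j}(h)\right| = \frac{\chi(h, R_{j-1})}{|R_{j-1}|} = O\left(\frac{2^{j-1}\sqrt{d}\,k_{j-1}^{x}(n/2^{j-1})^{y}\log^{1/2}(n/2^{j-1})}{n}\right).$$
Substituting $k_i = k\min\{c/2^i, 1\}$, for each $i$, and adding up the inequalities, the last right-hand side dominates, so we obtain
$$\left|\overline{P}(h) - \overline{R_j}(h)\right| = O\left(\frac{2^{(1-x-y)j}\sqrt{d}\,k^{x}n^{y}\log^{1/2}(n/2^{j-1})}{n}\right).$$
Substituting $k = pn$ and the values of $x$ and $y$, this is equal to
$$O\left(\frac{\sqrt{d}\,p^{(\alpha-\beta)/2}\log^{1/2} n_j}{n_j^{1-\alpha/2}}\right).$$
We choose the first $j$ so that this bound is at most $\varepsilon p$. That is,
$$n_j^{1-\alpha/2} = \Theta\left(\frac{\sqrt{d}\,\log^{1/2} n_j}{\varepsilon\,p^{1-(\alpha-\beta)/2}}\right), \quad\text{or}\quad |R_j| = n_j = \Theta\left(\frac{d^{\mu/2}}{\varepsilon^{\mu}p^{\gamma}}\log^{\mu/2}\frac{d}{\varepsilon p}\right).$$
We note that the choice of $n_j$ satisfies the lower bound constraint in Lemma 4.5. Hence, taking $Z = R_j$ completes the proof. ■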

Obtaining a relative approximate count is done exactly as in the three-dimensional case, producing a sequence of approximations, and searching through the sequence for the approximation which caters for the correct range of the size of $P \cap h$. We thus have the following result.

Theorem 4.6 Given a set $P$ of $n$ points in $\mathbb{R}^d$, and two parameters $0 < \varepsilon < 1$, $0 < p < 1$, we can construct $k = O\left(\log\frac{1}{p}\right)$ subsets of $P$, $Z_0, Z_1, \ldots, Z_k$, of total size $O\left(\frac{d^{\mu/2}}{\varepsilon^{\mu}p^{\gamma}}\log^{\mu/2}\frac{d}{\varepsilon p}\right)$, so that, given any halfspace $h$ containing $qn$ points of $P$, we can find a set $Z_t$ that satisfies
$$\left|\overline{P}(h) - \overline{Z_t}(h)\right| \le \begin{cases} \varepsilon\,\overline{P}(h), & \text{if } q \ge p, \\ \varepsilon p, & \text{if } q < p. \end{cases}$$



The (brute-force) time it takes to search for Zt and obtain the count \h (1 Zt\ is 

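Discussion. We have two competing constructions, the "traditional" one, with $N = \Theta\left(\frac{d}{\varepsilon^2 p}\log\frac{1}{p}\right)$ elements (see Section 2), and the new one, with $N' = \Theta\left(\frac{d^{\mu/2}}{\varepsilon^{\mu}p^{\gamma}}\log^{\mu/2}\frac{d}{\varepsilon p}\right)$ elements. The new construction is better, in terms of the size of the approximation, when
$$\frac{d^{\mu/2}}{\varepsilon^{\mu}p^{\gamma}}\log^{\mu/2}\frac{d}{\varepsilon p} < \frac{d}{\varepsilon^2 p}\log\frac{1}{p}$$
(for simplicity, we ignore the constants of proportionality). For further simplicity, assume that $p$ is not much larger than $\varepsilon$, so that $\log\frac{1}{p}$ and $\log\frac{d}{\varepsilon p}$ are roughly the same, up to some constant factor. Then we replace the above condition by
$$\frac{d^{\mu/2}}{\varepsilon^{\mu}p^{\gamma}}\log^{\mu/2}\frac{d}{\varepsilon p} < \frac{d}{\varepsilon^2 p}\log\frac{d}{\varepsilon p}.$$
Substituting the values of $\gamma$ and $\mu$, and simplifying the expressions, this is equivalent to
$$\frac{1}{p^{d\beta}} < \frac{d}{\varepsilon^2}\log\frac{d}{\varepsilon p}, \quad\text{or}\quad p > \left(\frac{\varepsilon^2/d}{\log(d/(\varepsilon p))}\right)^{1/(d\beta)},$$
where $\beta = 1 - 1/d^*$. This establishes a lower bound for $p$, above which the new construction takes over. For example, for $d = 4$, $p$ has to be $\Omega\left(\varepsilon/\log^{1/2}\frac{1}{\varepsilon}\right)$.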
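The trade-off between the two constructions can be explored numerically. The sketch below evaluates the two size expressions with all constants of proportionality set to 1 (an assumption made purely for illustration), using $\gamma = 1 + (1 - 1/d^*)/(1 + 1/d)$ and $\mu = 2d/(d+1)$ as we read the definitions of this section. The function names are ours.

```python
import math

def exponents(d: int):
    """Return (gamma, mu) for dimension d, with d* = floor(d / 2)."""
    d_star = d // 2
    gamma = 1 + (1 - 1 / d_star) / (1 + 1 / d)
    mu = 2 * d / (d + 1)
    return gamma, mu

def traditional_size(d: int, eps: float, p: float) -> float:
    """(d / (eps^2 p)) * log(1/p), constants ignored."""
    return d / (eps ** 2 * p) * math.log(1 / p)

def new_size(d: int, eps: float, p: float) -> float:
    """(d^{mu/2} / (eps^mu p^gamma)) * log^{mu/2}(d / (eps p)), constants ignored."""
    gamma, mu = exponents(d)
    return d ** (mu / 2) / (eps ** mu * p ** gamma) * math.log(d / (eps * p)) ** (mu / 2)

if __name__ == "__main__":
    for eps, p in ((0.001, 0.5), (0.1, 1e-4)):
        print(f"d=4 eps={eps} p={p}: traditional={traditional_size(4, eps, p):.3g} "
              f"new={new_size(4, eps, p):.3g}")
```

For $d = 4$ this gives $\gamma = 1.4$ and $\mu = 1.6$. With $p$ large relative to $\varepsilon$ the new expression is the smaller one, while for very small $p$ the traditional expression wins, in agreement with the lower bound on $p$ derived above.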



5 Approximate range counting in two and three dimensions

In this section we slightly deviate from the main theme of the paper. Since approximate range counting is one of the main motivations for introducing relative $(p, \varepsilon)$-approximations, we return to this problem, and propose efficient solutions for approximate halfspace range counting in two and three dimensions. The solution to the 3-dimensional problem uses relative approximations, whereas the solution to the 2-dimensional problem is simpler and does not require such approximations. Both solutions pass to the dual plane / space, construct a small subset of levels, of small overall complexity, in the arrangement of the dual lines / planes, and search through them with the point dual to the query halfplane / halfspace to retrieve the approximate count.

We first present two solutions for the planar case, and then consider the 3-dimensional case.

Let $P$ be a set of $n$ points in the plane in general position, and $\varepsilon > 0$ be a prescribed parameter. The task at hand is to preprocess $P$ for halfplane approximate range counting; that is, given a query halfplane $h$, we wish to compute a number $\mu$ that satisfies $(1 - \varepsilon)|h \cap P| \le \mu \le (1 + \varepsilon)|h \cap P|$. We can reformulate the problem in the dual plane, where the problem is to preprocess the set $\mathcal{L}$ of lines dual to the points of $P$ for approximate vertical-ray range counting queries; that is, given a vertical ray $\rho$, we want to count (approximately, within a relative error of $\varepsilon$) the number of lines of $\mathcal{L}$ that intersect $\rho$. Without loss of generality, we only consider downward-directed rays.

Let $\Lambda_i$ denote the $i$th level in the arrangement $\mathcal{A}(\mathcal{L})$; this is the closure of the set of all the points on the lines of $\mathcal{L}$ that have exactly $i$ lines of $\mathcal{L}$ passing below them. Each $\Lambda_i$ is an $x$-monotone polygonal curve, and its combinatorial complexity (or just complexity) is the number of its vertices.

Lemma 5.1 For integers $x > y > 0$, the total complexity of the levels of $\mathcal{A}(\mathcal{L})$ in the range
$[x, x + y]$ is $O(nx^{1/3}y^{2/3})$.

In particular, the average complexity of a level in this range is $O(n(x/y)^{1/3})$.

Proof: This result is a strengthening of a similar albeit weaker bound due to Welzl [Wel86],
and is implicit in [And00, AAHSW98]. It was recently rederived, in a more general form, in
an unpublished M.Sc. thesis by Kapelushnik [Kap08]. We sketch the proof for the sake of
completeness.

Consider the primal setting, and connect two points $u, v \in P$ by an edge, if the open
halfplane bounded by the line through $u$ and $v$ and lying below that line contains exactly $j$
points of $P$ (we refer to $(u, v)$ as a $j$-set), where $j \in [x, x + y]$. Let $E$ denote the resulting
set of edges.

All the edges of $E$ that are $j$-sets can be decomposed into $j + 1$ concave chains (see, e.g.,
[AACS98, Dey98]). Similarly, they can be decomposed into $n - j$ convex chains. Overall,
the edges of $E$ can be decomposed into at most

$$\alpha = \sum_{j=x}^{x+y} (j + 1) = O((x + y)^2 - x^2) = O(y^2 + xy) = O(xy)$$

concave chains, and into at most

$$\beta = \sum_{j=x}^{x+y} (n - j) = O(ny)$$

convex chains. Each pair of a convex and a concave chain can intersect in at most two points.
This implies that the segments of $E$ can cross each other at most $2\alpha\beta = O(nxy^2)$ times.

On the other hand, consider the (straight-edge plane embedding of the) graph $G = (P, E)$.
It has $n$ vertices and $m = |E|$ edges. By the classical Crossing Lemma (see [PA95]), it has
$X = \Omega(m^3/n^2)$ crossing pairs of edges (assuming $m = \Omega(n)$). We thus have

$$\Omega(m^3/n^2) = X = O(nxy^2),$$

or $m = O(nx^{1/3}y^{2/3})$.

The second claim in the lemma is an immediate consequence of this bound. 
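For completeness, the last two steps of the proof can be spelled out as follows. The constant $C$ is introduced here only for illustration and does not appear in the original argument; the case $m = O(n)$, where the Crossing Lemma does not apply, satisfies the claimed bound trivially.

```latex
% Crossing Lemma lower bound combined with the chain-based upper bound:
\[
  \frac{m^3}{n^2} \le C\, n x y^2
  \quad\Longrightarrow\quad
  m \le C^{1/3}\, n\, x^{1/3} y^{2/3}.
\]
% The range [x, x+y] contains y+1 levels, so the average complexity of a level is
\[
  \frac{m}{y+1}
  = O\!\left(\frac{n\, x^{1/3} y^{2/3}}{y}\right)
  = O\!\left(n\,(x/y)^{1/3}\right).
\]
```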



Claim 5.2 (i) For each $i < n/2$, for $0 < \varepsilon < 1$, and for any fixed positive constant $c$, let $k$
be an integer chosen randomly and uniformly in the range $[i, (1 + \varepsilon/c)i]$. Then the expected
complexity of $\Lambda_k$ is $O(n/\varepsilon^{1/3})$. This bound also holds, with an appropriate choice of the
constant of proportionality, with probability $\ge 1/2$.

(ii) If $\Lambda_k$ has complexity $u$ then it can be replaced by an $x$-monotone polygonal curve with
$O(u/(\varepsilon k))$ edges, which lies between the two curves $\Lambda_{k(1-\varepsilon)}$ and $\Lambda_{k(1+\varepsilon)}$.
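The shortcutting operation behind part (ii) can be sketched as follows. This is a minimal illustration under simplifying assumptions: the level is given as a finite list of vertices sorted by $x$-coordinate (the two unbounded end edges of a real level are ignored), and the helper name `shortcut` is hypothetical. The sketch only performs the vertex skipping; it does not verify that the result stays between the two neighboring levels.

```python
def shortcut(vertices, eps, k):
    """Keep every s-th vertex of an x-monotone chain, with s = max(1, floor(eps * k)).

    The last vertex is always kept, so the shortcut chain spans the same x-range.
    """
    if not vertices:
        return []
    step = max(1, int(eps * k))
    kept = vertices[::step]
    if (len(vertices) - 1) % step != 0:
        kept.append(vertices[-1])
    return kept


# Assumed toy chain with 105 vertices; eps * k = 10, so roughly one vertex in ten survives.
chain = [(x, (x * x) % 7) for x in range(105)]
short = shortcut(chain, eps=0.25, k=40)
```

A chain with $u$ vertices is reduced to about $u/(\varepsilon k)$ vertices, which matches the edge count stated in the claim.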

Proof: The first claim is an immediate consequence of Lemma 5.1, by setting $x = i$ and
$y = (\varepsilon/c)i$; the probability bound then follows from Markov's inequality. The second claim
is well known: the curve is obtained by shortcutting $\Lambda_k$ in "jumps" of $\varepsilon k$ vertices; see, e.g., [Mat90]. ■

We present two variants of an algorithm for the problem at hand, which differ in the
dependence of their performance on $\varepsilon$. The first has $O(\log(n/\varepsilon))$ query time, but requires
$O(n/\varepsilon^{7/6})$ storage, while the second one uses only $O(n)$ storage (no dependence on $\varepsilon$), but
its query time is $O(\log n + 1/\varepsilon^2)$.

5.1 Fast query time 

We first compute the union $F_0$ of the first $M = \lceil 1/\varepsilon^{7/6} \rceil$ levels of $\mathcal{A}(\mathcal{L})$. As is well known
(see, e.g., [CS89]), the overall complexity of $F_0$ is $O(nM) = O(n/\varepsilon^{7/6})$. Next, set $n_i :=
\lfloor M(1 + \varepsilon)^i \rfloor$, for $i = 0, \ldots, u = O(\log_{1+\varepsilon}(n/M)) = O(\frac{1}{\varepsilon}\log n)$. Let $c$ be a constant that
satisfies {1 + e/cY < 1 + £^ for < e < 1 (c = 12 would do). We pick a random level with 
index in the range Ui/il — e/c), . . . , ?7,j(l + e/c)^; by Claim [5T2] (i) . most of these levels have 
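complexity $O(n/\varepsilon^{1/3})$. We thus assume that the chosen level has this complexity (or else we
resample; since the probability of success is at least $1/2$, this does not affect the expected
running time). We then simplify each such level, using Claim 5.2(ii) (with $\varepsilon/c$ instead of $\varepsilon$).
The resulting polygonal curve $\gamma_i$ is easily seen to lie (strictly) between $\Lambda_{n_i}$ and $\Lambda_{n_{i+1}}$, and
its complexity is

$$O\left(\frac{n/\varepsilon^{1/3}}{\varepsilon n_i}\right) = O\left(\frac{n}{\varepsilon^{4/3} n_i}\right).$$

In particular, the total complexity of the curves $\gamma_1, \ldots, \gamma_u$ is $\sum_i O\left(\frac{n}{\varepsilon^{4/3} n_i}\right) = O(n/\varepsilon^{7/6})$,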

and these curves are pairwise disjoint. Together with the segments in $F_0$, they form a planar
subdivision $Q$ with $O(n/\varepsilon^{7/6})$ edges, which we preprocess for efficient point location. Using
Kirkpatrick's algorithm [Kir83], this can be done with $O(n/\varepsilon^{7/6})$ preprocessing time and
storage, and a query can be answered in $O(\log(n/\varepsilon))$ time. We also store (with no extra
asymptotic cost) a count $N_e$ with each edge $e$ of $Q$. It is equal to the level of $e$ if $e$ is an edge
of $F_0$, and to $n_i$ if $e$ is an edge of $\gamma_i$. Now, given a query point $q$ (i.e., a downward-directed
ray emanating from $q$), we locate $q$ in $Q$ and retrieve the count $N_e$, where $e$ is the edge lying
directly below $q$. It is easy to verify that $N_e$ is indeed an $\varepsilon$-approximation of the number of
lines below $q$. We have thus shown:

Theorem 5.3 Given a set $P$ of $n$ points in the plane, and a parameter $0 < \varepsilon < 1$, one
can build, in $O((n/\varepsilon^{4/3})\log^2 n)$ expected time, a data-structure that uses $O(n/\varepsilon^{7/6})$ space,
so that, given a query halfplane $h$, one can approximate $|h \cap P|$ within relative error $\varepsilon$, in
$O(\log(n/\varepsilon))$ time.

Proof: The construction is described above, and we only bound the running time. Computing
the bottom $M$ levels takes $O(nM + n\log n)$ time [ERvK96]. Computing the remaining
randomly chosen $O(\frac{1}{\varepsilon}\log n)$ levels requires $O((n/\varepsilon^{1/3})\log n)$ time per level, using the (very
involved) dynamic convex hull algorithm of [BJ02] (or simpler earlier algorithms with a slight
(logarithmic or sub-logarithmic) degradation in the running time). Overall, the running time
is $O(n/\varepsilon^{7/6} + n\log n + (n/\varepsilon^{4/3})\log^2 n)$. ■
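The role of the index sequence in Section 5.1 can be illustrated with an idealized sketch. It assumes that each stored curve coincides exactly with the level whose index it carries (so the slack introduced by the random choice of level and by the shortcutting is ignored), and it takes the exponent $7/6$ and the floor in $n_i$ as read from the text above. The function names are hypothetical.

```python
import bisect
import math


def level_indices(n, eps):
    """Stored indices: every level 1..M, then floor(M * (1 + eps)^i) up to n."""
    M = math.ceil(1.0 / eps ** (7.0 / 6.0))
    idx = list(range(1, M + 1))
    i = 1
    while True:
        v = math.floor(M * (1.0 + eps) ** i)
        if v > n:
            break
        if v > idx[-1]:  # skip repeated indices, keeping the list strictly increasing
            idx.append(v)
        i += 1
    return M, idx


def approx_count(t, idx):
    """Largest stored index not exceeding t, where t is the true number of lines below q."""
    if t <= 0:
        return 0
    return idx[bisect.bisect_right(idx, t) - 1]


eps = 0.1
M, idx = level_indices(10000, eps)
```

In this idealized model the answer is exact for counts up to $M$, and for larger counts it underestimates by a relative error of less than $2\varepsilon$, while only $M + O(\frac{1}{\varepsilon}\log n)$ indices are stored.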

5.2 Linear space 

Let $M$ be a random integer in the range $[1/\varepsilon^2, 2/\varepsilon^2]$. By Lemma 5.1, the expected complexity
of the level $\Lambda_M$ is $O(n)$. By Markov's inequality, the complexity of $\Lambda_M$ is $O(n)$
with probability at least $1/2$, with an appropriate choice of the constant of proportionality.
Thus, redrawing the index $M$ if necessary (without affecting the expected asymptotic running
time), we may assume that $\Lambda_M$ does have linear complexity. Next, we define the curves
$\gamma_0, \gamma_1, \ldots, \gamma_u$ as above, with the new value of $M$ as the starting index. Note that each $\gamma_i$ is
a shortcutting of a random level in the range $[K_i, 2(1 + \varepsilon)K_i]$, where $K_i = (1/\varepsilon^2)(1 + \varepsilon)^i$.
More precisely, we can regard the random choice of the level from which $\gamma_i$ is produced
as a 2-step drawing, where we first draw $M$ and then draw $k$ in the "middle" of the range
$[M(1 + \varepsilon)^i, M(1 + \varepsilon)^{i+1}]$, as above. The combined drawing is not exactly uniform, but is close
enough to make Lemma 5.1 and Claim 5.2 hold in this scenario too.* Hence, the expected
complexity of the level corresponding to $\gamma_i$ is $O(n)$ for each $i$. Thus, the overall expected
complexity of the shortcut curves $\gamma_0, \gamma_1, \ldots, \gamma_u$ is now only

$$\sum_{i=0}^{u} O\left(\frac{n}{\varepsilon M (1 + \varepsilon)^i}\right) = O\left(\frac{n}{\varepsilon^2 M}\right) = O(n)$$

(with a constant of proportionality independent of $\varepsilon$). We construct the collection of these
curves, and assume (using resampling if necessary) that their overall complexity is indeed
linear. We preprocess the planar map formed by these curves, and by the edges of $\Lambda_M$, for
fast point location, as above, and store with each curve $\gamma_i$, for $i \ge 0$, the level $n_i$ that it
approximates. In addition, we sweep $\Lambda_M$ from left to right, and store, with each of its edges
$e$, the (fixed) set of lines passing below (any point on) $e$. This can be done with only $O(n)$
storage, using persistence [ST86]. Now, given a query point $q$, we locate it in the planar
map. If it lies above $\gamma_0$, then the index stored at the segment lying directly below $q$ is an
$\varepsilon$-approximation of the number of lines below $q$. If $q$ lies below $\gamma_0$, we find the edge $e$ of
$\Lambda_M$ lying above or below $q$, retrieve the set of lines stored at $e$ (using the persistent data
structure), and search it, in $O(1/\varepsilon^2)$ time, to count (exactly) the number of lines below $q$.
We thus have shown:

* Technically, in Lemma 5.1 we assume $y < x$, and here we have $y = x(1 + 2\varepsilon)$, but the lemma continues
to hold in this case too, as is easily checked.

Theorem 5.4 Given a set $P$ of $n$ points in the plane, and a parameter $0 < \varepsilon < 1$, one
can build, in $O(\frac{n}{\varepsilon}\log^2 n)$ time, a data-structure that uses $O(n)$ space, so that, given a query
halfplane $h$, one can approximate $|h \cap P|$, within relative error $\varepsilon$, in $O(\log n + 1/\varepsilon^2)$ time.

Proof: The construction requires the computation of $O(\frac{1}{\varepsilon}\log n)$ levels, each of expected
complexity $O(n)$. Thus, this takes $O(\frac{n}{\varepsilon}\log^2 n)$ time (or slightly worse, as in the comment
in the preceding proof). The query time and space complexity follow from the discussion
above. ■

Observe that Theorem 5.3 and Theorem 5.4 improve over the previous results in [AH08,
KS06], which have query time ^ log^n).

It would also be interesting to compare these results to the recent technique of Aronov 
and Sharir [AS08] : as presented, this technique caters only to range searching in four and 
higher dimensions, but it can be adapted to two or three dimensions too. 

5.3 Approximate range counting in three dimensions 

We can extend the above algorithms to three dimensions. After applying duality, the input
is a set $H$ of $n$ planes in 3-space, which we want to preprocess for approximate vertical ray
range counting. The general idea is very similar: (i) Compute a sequence of levels of $\mathcal{A}(H)$,
whose indices form roughly a geometric sequence. (ii) Replace each level by a simplified
$xy$-monotone polyhedral surface which approximates it well. (iii) Find the belt between two
consecutive surfaces which contains the query point $q$ (the apex of the query vertical ray),
and thereby obtain the desired approximate count. Implementing step (ii) is considerably
harder in three dimensions than in the plane, and we do it using an appropriate relative
$(p, \varepsilon)$-approximation.

For the sake of simplicity of presentation, we do not attempt to optimize the choice of 
parameters, and just describe the general technique. Concrete and improved versions can be 
worked out by the interested reader. 

Approximating a specific level. Consider first the problem of approximating a specific
level $m$ of $\mathcal{A}(H)$. Consider the range space that has $H$ as the ground set, whose ranges
are induced by vertical downward-directed rays, where the range associated with a ray $\rho$
is the subset of planes of $H$ crossed by $\rho$. This range space has finite VC-dimension, so
we can apply to it the analysis of Section 2. Put $p = m/n$, and construct a $(p, \varepsilon/3)$-relative
approximation $B \subseteq H$, by taking a random sample of size $O((\log n)/(\varepsilon^2 p))$ from
$H$ (see Theorem 2.11 and [LLS01]): With high probability, the sample is indeed such an
approximation. Set $\nu = p|B| = O(\varepsilon^{-2}\log n)$. By construction, the $\nu$th level $\Lambda_\nu$ of $\mathcal{A}(B)$
is guaranteed to lie between the levels $(1 - \varepsilon/3)m$ and $(1 + \varepsilon/3)m$ of $\mathcal{A}(H)$, so it provides
an adequate approximation to the $m$th level of $\mathcal{A}(H)$. Since $\nu$ is "small", we can compute
$\Lambda_\nu$ in time $O((n/\varepsilon^{O(1)})\log^{O(1)} n)$, using, e.g., the algorithm of [Cha00].
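The parameter arithmetic of this step can be made concrete with a small sketch. The leading constant `c` of the sample size is a hypothetical placeholder (the text only gives the asymptotic form), the natural logarithm is an arbitrary choice, and all function names are illustrative.

```python
import math


def sample_size(n, p, eps, c=1.0):
    """Sample size c * log(n) / (eps^2 * p), capped at n; c is an assumed constant."""
    return min(n, math.ceil(c * math.log(n) / (eps * eps * p)))


def level_in_sample(m, n, size):
    """Index of the level of A(B) that stands in for level m of A(H): about p * |B|."""
    return round((m / n) * size)


def estimate_count(sample_count, n, size):
    """Scale a count measured in the sample back to the full set of n planes."""
    return sample_count * n / size


n, m, eps = 1_000_000, 10_000, 0.5
size = sample_size(n, m / n, eps)
nu = level_in_sample(m, n, size)
back = estimate_count(nu, n, size)
```

With these assumed values the sample has 5527 planes and $\nu = 55$, which is close to $\varepsilon^{-2}\log n \approx 55.3$: the level index in the sample depends on $n$ and $\varepsilon$ but not on $m$, which is why $\nu$ is "small".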

Approximating all levels. We first compute explicitly all the $1/\varepsilon$ bottom levels of $\mathcal{A}(H)$.
Their overall complexity is $O(n/\varepsilon^2)$ [CS89], and their construction takes $O(n/\varepsilon^2 + n\log n)$
time [Cha00].

Next, we approximate each of the levels $(1/\varepsilon)(1 + \varepsilon)^i$, up to relative error of $\pm\varepsilon/3$,
using the algorithm described above, for $i = 1, \ldots, O((\log n)/\varepsilon)$.

This results in a sequence of $O(\varepsilon^{-1}\log n)$ pairwise disjoint $xy$-monotone polyhedral surfaces
in $\mathbb{R}^3$ (i.e., the exact $O(\varepsilon^{-1})$ bottom levels, and the additional $O(\varepsilon^{-1}\log n)$ approximated
levels). We need to store these surfaces so that, given a query point $q$, the two surfaces
which lie directly above and below $q$ can be found efficiently. This is done using binary search
through the sequence of surfaces, where each step of the search is implemented by locating
the $xy$-projection $q^*$ of $q$ in the $xy$-projection of a surface (which is a planar map), and then
by testing $q$ against the plane inducing the face containing $q^*$. Thus the cost of a query is
$O(\log(\varepsilon^{-1}\log n) \cdot \log(\varepsilon^{-1}n))$. The index of the surface directly below $q$ (namely, either its
exact level if it is one of the first bottom levels, or the index of the level of $\mathcal{A}(H)$ that it
approximates) yields the desired approximate count.
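The binary search through the sequence of surfaces can be sketched as follows. As a simplifying assumption, each surface is modeled as a callable that returns its height above a point $(x, y)$; in the actual structure this evaluation is a planar point-location query followed by a test against one plane. The function name and the toy surfaces are hypothetical.

```python
def surface_below(surfaces, q):
    """Index of the highest surface lying strictly below q, or -1 if there is none.

    `surfaces` must be pairwise disjoint and sorted from bottom to top, so that
    "surface i is below q" is a monotone predicate in i.
    """
    x, y, z = q
    lo, hi = 0, len(surfaces)  # invariant: surfaces[:lo] are below q, surfaces[hi:] are not
    while lo < hi:
        mid = (lo + hi) // 2
        if surfaces[mid](x, y) < z:
            lo = mid + 1
        else:
            hi = mid
    return lo - 1


# Assumed toy input: ten parallel, hence pairwise disjoint, tilted planes z = j + 0.1 * x.
surfaces = [lambda x, y, j=j: j + 0.1 * x for j in range(10)]
hit = surface_below(surfaces, (0.0, 0.0, 3.5))
```

Each of the $O(\log(\varepsilon^{-1}\log n))$ search steps costs one surface evaluation, which is where the second logarithmic factor of the query time comes from.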

The preceding analysis is easily seen to imply that the overall storage and preprocessing
cost of the algorithm are both $O((n/\varepsilon^{O(1)})\log^{O(1)} n)$. (Concrete and reasonably small values
of the powers of the polylogarithmic factor and of the factor $1/\varepsilon^{O(1)}$ can be easily worked
out, but we skip over this step.) Hence we obtain the following result.

Theorem 5.5 Given a set $P$ of $n$ points in three dimensions and a parameter $0 < \varepsilon < 1$,
one can build a data-structure, in $O((n/\varepsilon^{O(1)})\log^{O(1)} n)$ time and space, so that, given a
query halfspace $h$, one can approximate $|P \cap h|$, up to relative error of $\pm\varepsilon$. The query time
is $O(\log(\varepsilon^{-1}\log n) \cdot \log(\varepsilon^{-1}n))$.

As in the planar case, Theorem 5.5 improves over the previous results [AH08, KS06],
which require (^log^n) time to answer a query. However, in a subsequent work, Afshani 
and Chan [AC09] managed to obtain an improved solution. Specifically, they show that,
with $O_\varepsilon(n\log n)$ expected preprocessing time, one can build a data structure of expected
size $O_\varepsilon(n)$ which can answer approximate 3-dimensional halfspace range counting queries in
$O_\varepsilon(\log(n/k^*))$ expected time, where $k^*$ is the actual value of the count, and $O_\varepsilon(\cdot)$ hides constant
factors that are polynomial in $1/\varepsilon$. It would also be interesting to compare our result to the
appropriate variant of the technique of [AS08].

6 Conclusions 

In this paper we first established connections between the $(\nu, \alpha)$-samples of Li et al. [LLS01]
and relative $(p, \varepsilon)$-approximations (and other notions of approximation*). This has allowed
us to establish sharp upper bounds on the size of relative $(p, \varepsilon)$-approximations in arbitrary
range spaces of finite VC-dimension. We then turned to study geometric range spaces, 
and gave a construction of even smaller-size relative approximations for halfplane ranges, 
by revisiting the classical construction of spanning trees with low crossing number, and 
by modifying it to be "weight-sensitive". We then gave similar constructions of "almost" 
relative approximations for halfspace ranges in three and higher dimensions, using a different 
approach. Finally, we have also revisited the approximate halfspace range-counting problem 
in two and three dimensions, and provided better algorithms than those previously known. 

* Which we did at no extra charge!

There are several interesting open problems for further research. The main one is to
extend the construction of spanning trees with small relative crossing number to three and
higher dimensions. Another open problem is to improve Theorem 3.5. A minor further
improvement of Theorem 3.5 is possible by plugging the construction of Theorem 3.5 into
the construction of Lemma This still falls short of the desired spanning tree with crossing 
number $O(\sqrt{w_\ell})$, for a line $\ell$ of weight $w_\ell$. We leave this as an open problem for further
research. 

Interestingly, the partition of Lemma 3.3 can be interpreted as a strengthening of the
shallow partition theorem of Matoušek [Mat91b] in two dimensions. It is quite possible that
a similar (but probably weaker) strengthening is possible in three and higher dimensions.

A Proofs of some lemmas

A.1 Proof of Lemma 4.2

Proof: Put $k = \lfloor pn \rfloor$ and let $k_i$ be as defined in the proof of Theorem 4.1. Put $\lambda_i = |R_i \cap h|$,
for $i = 0, \ldots, j$ (so $\lambda_0 = |P \cap h|$). We prove the inequality in the lemma by induction on $i$,
which continues as long as n,- > - In'^^^ -: the induction will dictate the correct choice of c. 
The claim is trivial for $i = 0$, if we choose $c > 1$. Assume then that the inequality holds for
each $t < i$, and consider $\lambda_i$. We apply the improved discrepancy bound, given in the proof of

Theorem 14. H to for each t = 1, . . . this holds because Aj_i < kt-i, by the induction 

1 /s 

hypothesis. We thus have |At-i — 2\t\ = 0{k^!^^\ognt-i), or 

\2'-^\t-i - 2*At| < 2''^cklLl hgrit-i, 

for some absolute constant c. Adding these inequalities, for t = 1, . . . , i, we obtain (using 
the induction hypothesis) 

i 1 

|Ao-2U,| < ^|2*-^At_i-2%| < ^2*cA;y^lognt 
t=i t=o 
< c'c'/'2''/'k'/Hogn„ 

for some absolute constant c'. Hence, since Aq < k, we have 

k c'c^/^A;^/^ logra,- ck 
Ai < — H < — , 

if we choose $c$ to be a sufficiently large constant, satisfying $1 + c'c^{1/3} \le c$, and if we assume
that $k/2^i \ge \log^{3/2} n_i$, or $n_i/\log^{3/2} n_i \ge 1/p$, which holds, for any $i \le j$, by the assumptions
of the lemma. ■



A.2 Proof of Lemma 4.5

Proof: We proceed in much the same way as in the preceding proof. That is, put $\lambda_i =
|R_i \cap h|$, for $i = 0, \ldots, j$ (so $\lambda_0 = |P \cap h|$), and use induction on $i$, which continues as long as
rij = f2 log'^^^ . The claim is trivial for $i = 0$, if we choose $c > 1$. Assume then that the

inequality holds for each t < i, and consider Aj. We apply the improved discrepancy bound, 
given in the proof of Theorem \AA\ to Pt^i, for each t = 1, . . . ,i] this holds because At_i < 
kt-i, by the induction hypothesis. We thus have \Xt-i — 2At| = 0{y/dkf_^n^_^log^^'^ rit-i), 
or 

for some absolute constant c. Adding these inequalities, for t = 1, . . . ,z, we obtain (using 
the induction hypothesis) 

i 

|Ao-2'A,| < 5^|2*-iAi„i-2%| < 

i=l 

J22'cVdk^ny\og^/^nt < c'v^c^2(^-^-^)^rnnog^/2 
t=o 

for some absolute constant c'. Hence, since Aq < k, we have 

, k d y/dc^k^ny log^^"^ Ui ck 
Aj < — H TT- , which we want to be < — . 

To guarantee the last inequality, we choose c to be a sufficiently large constant, satisfying 
1 + c'c'^' < c, and require that 

Vdk^ny log^^^ rii ^ k 
Substituting k = pn and the values of x, y, this amounts to requiring that 

Vdlog'/'n, < = n^'V"", or 

\ P J \ p 

This holds, for any $i \le j$, by the assumptions of the lemma.